Postgresql – Why does PostgreSQL FDW talk to multiple backend servers for this point query

postgresqlpostgresql-fdwsharding

I've set up a PostgreSQL FDW server with the following table, sharded by user_id over four servers:

CREATE TABLE my_big_table
(
    user_id bigint NOT NULL,
    serial bigint NOT NULL,         -- external, incrementing only
    some_object_id bigint NOT NULL,
    timestamp_ns bigint NOT NULL,
    object_type smallint NOT NULL,
    other_type smallint NOT NULL,
    data bytea
) PARTITION BY LIST (mod(user_id, 4)) ;
CREATE SERVER shardA
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '192.168.200.11', port '5432', dbname 'postgres', fetch_size '10000');
 .
 .
 .
CREATE SERVER shardD
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '192.168.200.14', port '5432', dbname 'postgres', fetch_size '10000');
create foreign table my_big_table_mod4_s0 partition of my_big_table
    for values in (0) server shardA
    OPTIONS (table_name 'my_big_table_mod4_s0');
 .
 .
 .
create foreign table my_big_table_mod4_s3 partition of my_big_table
    for values in (3) server shardD
    OPTIONS (table_name 'my_big_table_mod4_s3');

Given a query for a single user_id, I was hoping for FDW to select a single backend based on simple partition pruning, but explain shows a foreign table scan against all four servers… How can I hint FDW to be smarter?

Best Answer

PostgreSQL lacks the insight into the "mod" function that it would need to declare that user_id=97 also implies that mod(user_id,4)=1. If you provide that insight manually, it would likely honor it:

WHERE user_id=$1 and mod(user_id,4)=mod($1,4)

This has nothing to do with FDW. If all partitions/tables were local, the answer would remain the same.

You could use hash partitioning, then it would generate the insight automatically.