Postgresql – Can Postgres partition table by column values to enable partition pruning

partitioningpostgresql

I'm using Hive currently and it has partitioned tables.
Let's say this is my data:

Country City DailySale
USA     NY   200.2
USA     SF   190.5
CAN     TR   390.4

The I can define my table


create table SALES_DATA(
Country    STRING,
City       STRING,
DailySale  BIGDECIMAL
) PARTITIONED BY (Country, City)

Then I create catalog structure on HDFS:

SALES_DATA/
          Country=USA/
                      City=NY/
                              data-file-0001
                              data-file-0002
... the rest omitted for brevity...

Then partition pruning can be applied for queries like

SELECT SUM(DailySale) FROM SALES_DATA
WHERE  Country='USA' and City='SF'

Hive engine will prune all partitions except one. Super effective and optimal.

I would like to implement smth similar on Postgres. i'll have immutable tables with aggregated data and would like to apply smth similar for them.
I'm trying to go through it
https://www.postgresql.org/docs/13/ddl-partitioning.html
but don't really follow if it's what I need.

Best Answer

In such a case having an index on country and city will probably perform much better than partitioning.

If you are really set on using partitioning, postgres always uses the CHECK constraints to prune partitions. A way to approach this is to:

  1. Create a function that based on the "country" & "city" input will create a table that inherits from salesdata (if the table already exists return it)
  2. If the table is newly created add a check constraint on the values of country & city.
  3. Create a trigger that executes before/after insert into salesdata that will run that function and insert into the country & city partition.

When following this procedure and you do a select on SalesData postgres will prune the tables not matching the check constraint.