Postgresql – How to merge partitions in Postgres

optimizationpartitioningperformancepostgresqlquery-performance

I have a table that is partitioned by date on a monthly basis beginning January 1, 2014. There is significantly more rows of data in 2015 than there were in 2014 and the data in 2014 is not accessed frequently, so I am considering merging the rows from 2014 into a single partition. My assumption is that this will improve the query optimizer and help prevent too many partitions from being created over time. The steps I was planning to do that are:

  1. Create a new partition called xyz_2014
  2. Copy the data from 2014 into xyz_2014
  3. Alter table xyz_2014 to inherit from xyz
  4. Alter old tables xyz_2014_01, xyz_2014_02, etc. to not inherit from xyz
  5. Update my INSERT trigger for xyz to reference xyz_2014
  6. Confirm that everything is working correctly
  7. Delete old tables

Is this approach typically how merging tables in Postgres is done?

Best Answer

If you are still inserting data for 2014, then you risk problems with this method, because rows inserted between steps 2 and 5 are going to end up getting dropped rather than moved.

If you are not still inserting data for 2014, then I think you should change step 5 to "rewrite the trigger to throw an error upon insertion of 2014 data" and move it up to be step 0. That would remove the doubt.

But it does seem like you are doing a lot of tinkering. If you want to remove the partitioning for 2014, why would you want to keep it for 2015? Why did you implement partitioning in the first place and why is that reason no longer valid for 2014 (but still valid for 2015)? Getting rid of the partitioning might speed up the queries, but it might not. Reorganizing partitioned tables isn't something you should do on a hunch. Do you have a QA system you can use to time the queries and see if they get faster? And wouldn't you want faster queries for the newer data at least as much as for the older data?