Postgresql – How to merge partitions in Postgres

optimizationpartitioningperformancepostgresqlquery-performance

I have a table that is partitioned by date on a monthly basis beginning January 1, 2014. There is significantly more rows of data in 2015 than there were in 2014 and the data in 2014 is not accessed frequently, so I am considering merging the rows from 2014 into a single partition. My assumption is that this will improve the query optimizer and help prevent too many partitions from being created over time. The steps I was planning to do that are:

Create a new partition called xyz_2014
Copy the data from 2014 into xyz_2014
Alter table xyz_2014 to inherit from xyz
Alter old tables xyz_2014_01, xyz_2014_02, etc. to not inherit from xyz
Update my INSERT trigger for xyz to reference xyz_2014
Confirm that everything is working correctly
Delete old tables

Is this approach typically how merging tables in Postgres is done?

Best Answer

If you are still inserting data for 2014, then you risk problems with this method, because rows inserted between steps 2 and 5 are going to end up getting dropped rather than moved.

If you are not still inserting data for 2014, then I think you should change step 5 to "rewrite the trigger to throw an error upon insertion of 2014 data" and move it up to be step 0. That would remove the doubt.

But it does seem like you are doing a lot of tinkering. If you want to remove the partitioning for 2014, why would you want to keep it for 2015? Why did you implement partitioning in the first place and why is that reason no longer valid for 2014 (but still valid for 2015)? Getting rid of the partitioning might speed up the queries, but it might not. Reorganizing partitioned tables isn't something you should do on a hunch. Do you have a QA system you can use to time the queries and see if they get faster? And wouldn't you want faster queries for the newer data at least as much as for the older data?

Related Solutions

Sql-server – Best practices for selecting partition boundary ranges to maintain empty partitions at both end

Let me start by mentioning that these 4 boundaries result in following 5 partitions. Partitions 1 and 5 are unbounded one one side and partitions 2, 3, and 4 are constrained to quarterly data:

partition 1: values less than '2014-04-01'
partition 2: values >= '2014-04-01' and < '2014-07-01'
partition 3: values >= '2014-07-01' and < '2014-10-01'
partition 4: values >= '2014-10-01' and < '2015-01-01'
partition 5: values >= '2015-01-01'

The rationale for keeping (at least) the first and last partitions empty is to ensure expensive data movement is not required during partition maintenance of an incremental or sliding window scenario. If data must be moved to a different partition to accommodate new boundaries during MERGE or SPLIT, the logging is about 4 times greater than normal DDL operations. This is especially nasty with large tables, as is common with table partitioning.

Data movement is never required when an existing empty partition is split or when 2 adjacent empty partitions are merged. This is why the best practice is to plan for such, making the maintenance a no-brainer that can easily be automated.

That said, one can split a non-empty partition without overhead in some cases. For example, you could split the function to add the new '2015-04-01' boundary without data movement as long no existing rows are >= '2015-04-01'. But to also avoid an expensive scan during the offline operation, you need an index with the partitioning column as the high-order key column so that SQL Server can efficiently validate no data exists beyond the new boundary. So in your case, you could probably split the last partition in advance of the start of the next quarter without incurring overhead if the partitioning column is indexed (a common design pattern).

Assuming your objective is 4 quarterly partitions of historical data plus the current active quarter in a sliding window, I suggest bounded partitions for each of these 5 quarters plus another bounded one for the future quarter. This will result in 8 partitions when you include the unbounded first and last ones.

CREATE PARTITION FUNCTION PfDateRange(date)
AS RANGE RIGHT FOR VALUES(
      '2014-01-01'
    , '2014-04-01'
    , '2014-07-01'
    , '2014-10-01'
    , '2015-01-01'
    , '2015-04-01'
    , '2015-07-01');

The reason one might want additional future empty partitions is to provide time to react in case partition maintenance is missed for some reason. Multiple future partitions are more important with more frequent maintenance (e.g. daily), and less so for quarterly maintenance.

After the start of the '2015-04-01' quarter, the partition maintenance script can:

switch '2014-01-01' to a and empty staging table and truncate staging
merge the '2014-01-01' partition to remove it
spilt at '2015-10-01' for the next future quarter

You may want to do a sanity check before the maintenance in case you have bad data outside of the expected range (less than '2014-01-01' or greater than '2015-07-01'). Alternatively, maintain a check constraint for the expected range so that only valid data can be inserted.

Sql-server – Star join query optimization – Change partitioning, use columnstore

Thanks for adding the query plan; it is very informative. I have a number of recommendations based on the query plan, but first a caveat: don't just take what I say and assume it's correct, try it out (ideally in your testing environment) first and make sure you understand why the changes do or don't improve your query!

The query plan: an overview

From this query plan (as well as the corresponding XML), we can immediately see a few useful pieces of information:

You are on SQL 2012
This is a classic star join query and you are getting the benefit of the in-row bitmap filter optimization that was added in SQL 2008 for such plans
The fact table contains about 1.5 billion rows, and just over 500 million of those rows match the dimension filters
The query requests 72GB of memory, but is only granted 12GB of memory (presumably, 12GB is the max that will be granted to any given query, meaning your machine likely has ~64GB of memory)
SQL Server is performing a sort-stream aggregate that takes 500 million rows down to just 600,000 rows. The sort is exceeding it's memory grant and spilling to tempdb
We have warnings for plan-affecting converts due to either explicit and implicit conversions in your query
The query uses 32 threads, but the initial seek into your fact table has an enormous thread skew; just 2 of the 32 threads do all of the work. (At subsequent steps in the query plan, however, the work is more balanced.)

Optimization: columnstore or not

This is a tough question, but on balance I would not recommend columnstore for you in this case. The primary reason is that you are on SQL 2012, so if you are able to upgrade to SQL 2014 I think it might be worth trying out columnstore.

In general, your query is the type that columnstore was designed for and could benefit greatly from the reduced I/O of columnstore and the greater CPU efficiency of batch mode.

However, the limitations of columnstore in SQL 2012 are just too great, and the tempdb spill behavior, where any spill will cause SQL Server to abandon batch mode entirely, can be a devastating penalty that might come into play with the large volumes of rows you are working with. If you do go with columnstore on SQL 2012, be prepared to baby-sit all of your queries very closely and ensure that batch mode can always be used.

Optimization: more partitions?

I don't think that more partitions will help this particular query. You are welcome to try it, of course, but keep in mind that partitioning is primarily a data management feature (the ability to swap in new data in your ETL processes via SWITCH PARTITION and not a performance feature. It can obviously help performance in some cases, but similarly it can hurt performance in others (e.g., lots of singleton seeks that now have to be performed once per partition).

If you do go with columnstore, I think that loading your data for better segment elimination will be more important than partitioning; ideally you will probably want as many rows in each partition as possible in order to have full columnstore segments and great compression rates.

Optimization: improving cardinality estimates

Because you have a huge fact table and a handful of very small (hundreds or thousands of rows) set of rows from each dimension table, I would recommend an approach where you explicitly create a temporary table containing only the dimension rows that you plan to use. For example, rather than join to Dim_Date with a complicated logic like cast(right(ALHDWH.dwh.Dim_Date.Financial_Year,4) as int) IN ( 2015, 2014, 2013, 2012, 2011 ), you should write a pre-proccessing query to extract only the rows from Dim_Date that you care about and add the appropriate PK to those rows.

This will allow SQL Server to create statistics on just the rows you are actually using, which may yield better cardinality estimates throughout the plan. Because this pre-processing would be such a trivial amount of work compared to the overall query complexity, I would highly recommend this option.

Optimization: reducing thread skew

It's likely that extracting the data from Dim_Date into it's own table and adding a primary key to that table would also help to reduce thread skew (an imbalance of work across threads). Here's a picture that helps show why:

In this case, the Dim_Date table has 22,000 rows, SQL Server estimated that you are going to use 7,700 of those rows, and you actually only used 1,827 of those rows.

Because SQL Server uses statistics in order to allocate ranges of rows to threads, the poor cardinality estimates in this case are likely the root cause of the very poor distribution of rows.

Thread skew on 1,872 rows may not matter much, but the painful point is that this then cascades down to the seek into your 1.5 billion row fact table, where we have 30 threads sitting idle while 600 million rows are being processed by 2 threads.

Optimization: getting rid of the sort spill

Another area I would focus on is the sort spill. I think that the primary problem in this case is poor cardinality estimates. As we can see below, SQL Server thinks that the grouping operation being performed by the combination of a Sort and Stream Aggregate will yield 324 million rows. However, it actually yields just 643,000 rows.

If SQL Server knew that so few rows would come out of this grouping, it would almost certainly use a HASH GROUP (Hash Aggregate) rather than a SORT GROUP (Sort-Stream) in order to implement your GROUP BY clause.

It's possible that this may fix itself if you make some of the other changes above in order to improve cardinality estimates. However, if it doesn't you could try to use the OPTION (HASH GROUP) query hint in order to force SQL Server to do so. This would let you evaluate the magnitude of the improvement and decide whether or not to use the query hint in production. I would generally be cautious about query hints, but specifying just HASH GROUP is a much lighter touch than something like using a join hint, using FORCE ORDER, or otherwise taking too much of the control out of the query optimizer's hands.

Optimization: memory grants

One last potential problem was that SQL Server estimated that the query would want to use 72GB of memory, but your server was not able to provision this much memory to the query. While it's technically true that adding more memory to the server would help, I think there are at least a couple other ways to attack this problem:

Get rid of the Sort operator (as described above); it's really the only operator consuming any substantial memory grant in your query
Split your query up into multiple batches; it may be the case that you can run the query once per partition, for example. This could reduce the size of the sort, keep it in memory, and potentially improve performance significantly. A side benefit could be that you might get better utilization of threads if you access just one partition since this does impact the way that SQL Server allocates threads to partitions in some cases.

Best Answer

Related Solutions

Sql-server – Best practices for selecting partition boundary ranges to maintain empty partitions at both end

Sql-server – Star join query optimization – Change partitioning, use columnstore

Related Question