Sql-server – Will table partitions help me

columnstoreindexpartitioningsql server

I have 10 tables.

dbo.Table2008
dbo.Table2009
...
dbo.Table2018

Each table holds approximately 500,000,000 records which are 20 columns wide (if that matters). And each table has a clustered columnstore index on it.
Each table also holds data only for the year in its name. For instance, in dbo.Table2008 you'll only see records for which the CreatedDate column is >= 20080101 and <= 20081231. But if I want to issue a query across 3 years I need to hit all three tables with a union.

My theory is if I have a single table called dbo.Table which is partitioned into ~120 partitions, one partition for every month of every year, then it'll not only reduce my table count. So I can change my query from

with cte as (
    select col1, col2, col3, col4
    from table2008
    where col4 >= 20080201 and col4 <= 20120801
    union
    select col1, col2, col3, col4
    from table2009
    where col4 >= 20080201 and col4 <= 20120801
    union
    select col1, col2, col3, col4
    from table2010
    where col4 >= 20080201 and col4 <= 20120801
    union
    select col1, col2, col3, col4
    from table2011
    where col4 >= 20080201 and col4 <= 20120801
) select ...
from cte
join LookupTable1 on ... = ...
join LookupTable2 on ... = ...

to

select col1, col2, col3, col4, lookuptablecol1, lookuptable2col2
from dbo.Table
join ALL MY LOOKUP TABLES

Is my theory correct?
Is table partitioning with parition schemes and partition functions what I'm after?
And will this theory work using clustered columnstore indexes?

Best Answer

Partitioning does not sound like a good fit for your scenario. Partitioning isn't something to be taken lightly. It requires a lot of planning and you'd have to rewrite 5 billion rows in order to convert the data. Instead you should consider a UNION ALL view. Depending on your requirements you could even make it a partitioned view.

A view will give you an easy way to query your data without any date movement. It's by far the simplest solution.

In general, partitioning doesn't improve query performance. However, with columnstore tables it can be helpful if it gives you better rowgroup elimination on the partitioned column. That can be accomplished without partitioning depending on how you build your columnstore tables. If you don't modify the old data it should be relatively straightforward to build your columnstore in a way that maintains order.

Columnstore tables do support partitioning, if you do have a legitimate need for it at some point. It's easy to experiment with it. Simply try to create an empty, partitioned columnstore table.