Sql-server – Expanding a partitioned table

partitioningsql serversql-server-2012

I have a table that holds four years of data, where for each month I have a partition that is based on a filegroup with two files. Unexpectedly, I received eight more years of data and prepared the database files and file groups for these years.

The current scheme in place covers 200901 – 201512 and I now need to add 200001 – 200812. What is the best way to bring this online for the partitioned table?

Create a new partition scheme and partition function and change the clustered index to this new partition scheme; or
Can I somehow alter the existing partition scheme and partition function?

I'm asking because the current table has 18 billion rows and I want to do this the efficient way 🙂

Best Answer

Here are two best practices for partitioning that pertain to the question:

Keep an empty staging partition at the leftmost and rightmost ends of the partition range to ensure that the partitions split when loading in new data, and merge, after unloading old data, do not cause data movement.
Do not split or merge a partition already populated with data because this can cause severe locking and explosive log growth.

http://www.informit.com/articles/article.aspx?p=1946159&seqNum=5

If the leftmost end of your partition is empty, use ALTER PARTITION FUNCTION SPLIT RANGE to add new ranges to the partition function.

To check if the leftmost partition is empty, use a query like the following:

DECLARE @PartitionFunctionName sysname = 'YourPartitionFunctionNameHere';

SELECT 
p.partition_number, SUM(pst.row_count) RowCountInPartition, pf.name PartitionFunction, ps.name PartitionScheme
FROM sys.dm_db_partition_stats pst
INNER JOIN sys.partitions p ON pst.partition_id = p.partition_id
INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id
INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id
INNER JOIN sys.partition_functions pf ON ps.function_id = pf.function_id
WHERE pf.name = @PartitionFunctionName
GROUP BY p.partition_number, pf.name, ps.name;

If the first partition is not empty, the best practices recommend that you create a new function with all values, create a new table on that function, then insert the data to the new table.

Also, if the left partition just has a few records, a split may be fine. Not sure on that as I've never tried it.

Whatever you do, make sure to leave some empty partitions at the leftmost and rightmost partition when you're finished. I might even go so far as to creating partition ranges for 0 and 1, then add a check constraint to prevent the first partition from getting data in it. Do the same thing for the end.

Related Solutions

Sql-server – How to take partitions or filegroups out of service

I can't give you an answer to question 1).
But to answer question 2):
What you are requesting is known as the "sliding window scenario".
If you want to eliminate a partition containing data that is no longer needed you have to perform following steps:

Create a table on the same filegroup as the partition you want to delete (typically, this table is referred to as a staging table).
This table has to have exact the same structure as the partitioned table, including indexes.
Switch the partition to the staging table
You have to remove the reference to the filegroup from the partition function vie MERGE RANGE ...
Now you can truncate the staging table and drop it. What is left is an empty filegroup you can remove, too.

Some more details on this at http://technet.microsoft.com/en-us/library/aa964122.aspx and the white paper on partitioning, available under http://msdn.microsoft.com/en-us/library/dd578580%28v=sql.100%29.aspx.

Sql-server – Adding Indexes To a Partitioned table in SQL Server 2012

In order for the table to be partitioned, your clustered index should be created as:-

ALTER TABLE [dbo].[Logs] ADD  CONSTRAINT [PK_Logs] PRIMARY KEY CLUSTERED 
(
    [logId] ASC,
    [time] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PS_dbo_Date_ByDay](Time)

You can then view the different partitions by running:-

SELECT 
    t.name AS TableName, i.name AS IndexName, p.partition_number, p.partition_id, 
    i.data_space_id, f.function_id, f.type_desc, r.boundary_id, r.value AS BoundaryValue, p.rows
FROM 
    sys.tables AS t
INNER JOIN
    sys.indexes AS i ON t.object_id = i.object_id
INNER JOIN
    sys.partitions AS p ON i.object_id = p.object_id AND i.index_id = p.index_id 
INNER JOIN
    sys.partition_schemes AS s ON i.data_space_id = s.data_space_id
INNER JOIN
    sys.partition_functions AS f ON s.function_id = f.function_id
LEFT OUTER JOIN 
    sys.partition_range_values AS r ON f.function_id = r.function_id AND r.boundary_id = p.partition_number
WHERE 
    t.name = 'Logs'
AND 
    i.type <= 1
ORDER BY p.partition_number;

The table is essentially carved up into different sections this will allow queries to reference the different sections without blocking each other (try inserting records into two different partitions at the same time, then try it on a non-partitioned table).

This will be the same for an ALIGNED index. Create it on the paritition scheme (the same as the clustered index) and it will also be carved up (the SELECT above will show you).

However you can create a NON-ALIGNED index, just don't add the ON PS_dbo_Date_ByDay, use PRIMARY or another filegroup.

Hope this helps.

EDIT:- Your primary key contains a column (LogID) that is not part of the partition. The table will be partitioned but there could be a performance impact. I asked a similar question on here a while ago, here's the link:-

Partitioning Query

Best Answer

Related Solutions

Sql-server – How to take partitions or filegroups out of service

Sql-server – Adding Indexes To a Partitioned table in SQL Server 2012

Related Question