Sql-server – How to reduce table partition timing for existing table in SQL Server

partitioningsql-server-2008

Repost from: https://stackoverflow.com/questions/48169487/how-to-reduce-table-partition-timing-for-existing-table-in-sql-server

I have a problem with table partition, I am working in a very big database, It is around 2 terabyte. So we planned to split the large tables into different partitions at last year. On that time we were divided a table into two partitons. One partition was located in primary file group and another one was located in secondary file group named as PartitionFG1. So we got lots of benefits from this activity like index maintenance and performance.

And then we have planned to create another one partition for the same table for this upcoming year records. So we have created a new secondary file group and we have altered the partition scheme to use new secondary file group named as PartitionFG2. So we have alter the partition function to split the range, here is the problem. Here the new range value split and moves the data into new partition but not into the new file group. For example table’s max PK value is 1000, so I used split range value as 1000. So it should not be taken any time to move data to new partition, because new partition will contains 0 records only.

Now I am going describe the scenario by sequence.

Table Creation

CREATE TABLE Tbl_3rdParatition(PK NUMERIC(18,0) identity(1,1),Line Varchar(100))

CREATE CLUSTERED INDEX CidxTbl_3rdParatition ON Tbl_3rdParatition (pk ASC)

Insert record into table

DECLARE @I INT DECLARE @CNT INT SET @I=1 SET @CNT = 1000

WHILE (@I< = @CNT)
BEGIN
  INSERT INTO Tbl_3rdParatition (Line) VALUES ('Primary')
  SET @I=@I+1
END

Now we have inserted 1000 records into the table. We can verify the table’s row and partition by below query.

SELECT p.partition_number, fg.name, p.rows 
FROM sys.partitions p 
INNER JOIN sys.allocation_units au ON au.container_id = p.hobt_id 
INNER JOIN sys.filegroups fg ON fg.data_space_id = au.data_space_id 
WHERE p.object_id = OBJECT_ID('Tbl_3rdParatition')

Record stored in Primary file group

So it ensures primary file group contains all records of the table. Now I am going to split the table into 2 partitions.

CREATE PARTITION FUNCTION PF_Tbl_3rdParatition(NUMERIC(18,0)) AS RANGE LEFT 
FOR VALUES(500);

CREATE PARTITION SCHEME PS_Tbl_3rdParatition AS PARTITION PF_Tbl_3rdParatition TO ([PRIMARY],[SWPPartitionFG1])

CREATE UNIQUE CLUSTERED INDEX CidxTbl_3rdParatition ON dbo.Tbl_3rdParatition(PK) WITH(DROP_EXISTING = ON)ON PS_Tbl_3rdParatition(PK) ;

Partition1 stored in SWPPartitionFG1

Now it is divided into 2 partitions. It took some time to move 500 records to SWPPartitionFG1.

Now I am going to create a new partition on another file group.

ALTER PARTITION SCHEME PS_Tbl_3rdParatition 
NEXT USED [SWPPartitionFG2]

ALTER PARTITION FUNCTION PF_Tbl_3rdParatition() 
SPLIT RANGE (1000) 

ALTER INDEX [CidxTbl_3rdParatition] ON [dbo].[Tbl_3rdParatition ] REBUILD  WITH (SORT_IN_TEMPDB = OFF, ONLINE = ON, MAXDOP = 1)

As of above query, 3rd partition should have 0 records. That is correct. But 3rd partition should be stored in SWPPartitionFG2 correct? But second partition’s data fully moved into SWPPartitionFG2. And the 3rd partition is allocated in SWPPartitionFG1 it is wrong!. So it takes too much of time to transfer the data from FG1 to FG2.

3rd Partition stored in SWPPartitionFG1

2nd partition moved to SWPPartitionFG2

I desired to store the 3rd partition data in third file group (SWPPartitionFG2). Then only it will not take large time in partitioning process. For this reason we want lots of time to create a new partition. Our client will not give that much of down time for us.

Our Actual table size is 300 GB. 2nd partitions SWPPartitionFG1 File group contains 200 GB of data. So it requires 2:30 hrs time to move the data From FG1 to FG2. Please help me reduce the time in this activity.

Best Answer

ALTER PARTITION SCHEME PS_Tbl_3rdParatition NEXT USED [SWPPartitionFG2]

ALTER PARTITION FUNCTION PF_Tbl_3rdParatition() SPLIT RANGE (1000)

ALTER INDEX [CidxTbl_3rdParatition] ON [dbo].[Tbl_3rdParatition ] REBUILD WITH (SORT_IN_TEMPDB = OFF, ONLINE = ON, MAXDOP = 1)

Why are you rebuilding the clustered index after creating the new partition?

When you SPLIT a RANGE LEFT function, the partition containing the new boundary is split. The new partition on SWPPartitionFG2 is created to the left of the SWPPartitionFG1 partition. The new partition on SWPPartitionFG2 becomes partition 2 and the original SWPPartitionFG1 partition is renumbered as partition 3. Existing rows in the spilt partition that are less than or equal to the new boundary are moved into the new partition during the process. This is a very expensive operation, requiring about 4 times the logging of normal DML. That is why one should plan to split only empty partitions, or at least ensure no rows must be moved.

It would be best to use a RANGE RIGHT function here so that incremental boundaries are created to the right of the SPLIT partition. Create a new RANGE RIGHT partition function and scheme and rebuild indexes specifying the new scheme with the 'DROP_EXISTING = ON` option. I would expect this to take much less time than the current operation and, going forward, the SPLIT to create incremental partition boundaries will be a fast meta-data only operation.

CREATE PARTITION FUNCTION PF_Tbl_3rdParatition_Right(NUMERIC(18,0)) AS RANGE LEFT 
FOR VALUES(500,1000);

CREATE PARTITION SCHEME PS_Tbl_3rdParatition_Right AS 
    PARTITION PF_Tbl_3rdParatition_Right TO ([PRIMARY],[SWPPartitionFG1],[SWPPartitionFG2]);

CREATE UNIQUE CLUSTERED INDEX CidxTbl_3rdParatition ON dbo.Tbl_3rdParatition(PK)
WITH(DROP_EXISTING = ON)
ON PS_Tbl_3rdParatition_Right(PK);

See Table Partitioning Best Practices for more information.