Sql-server – Splitting Existing Partitions in SQL 2014

partitioningsql serversql server 2014

I have a SQL Server table that contains over 20 billion rows, with 13 billion rows in the last partition.

The table is currently partitioned 6 ways by a "Time" field:

2011
2012
2013
2014
2015
2016/2017/2018

The 2016/2017/2018 partition needs to be split into their own partitions so there is a 2016, 2017, 2018 partition.

How can this be done and in the process avoiding a massive log file growth? The database is in Simple Recovery and currently each partition is in a separate file group on its own set of disks.

CREATE TABLE [dbo].[Log](

      [RefNo] [nchar](7) NOT NULL,

      [DevID] [nvarchar](7) NOT NULL,

      [Time] [datetime] NOT NULL,

      [summary] [float] NULL,

      [staging] [float] NULL,

      [position] [float] NULL,

CONSTRAINT [PK_Log] PRIMARY KEY CLUSTERED

(

      [RefNo] ASC,

      [DevID] ASC,

      [Time] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 96)

)

enter image description here

Best Answer

Below is an example script to create the new yearly partitions. You may need to tweak the script for your actual partition function name and new filegroup names. This method assumes the operation is performed in a maintenance window when there are no inserts into the table and data for 2016 and later isn't needed for queries.

The summary of the steps performed by the script are:

  • create new partition function and scheme like the originals
  • create new aligned staging table like the original
  • switch the partition containing data > 2016-01-01 into the staging table
  • add 3 new partitions to the original partition function/scheme
  • rebuild the staging table on the original partition with DROP_EXISTING=ON
  • switch the 3 new partitions from the staging table back into the original table

There are a couple of issues with your existing table partitioning you should be aware of. Because of the RANGE LEFT function, datetime values that are exactly midnight on the first of the year will be placed in the wrong partition. For example, a row with Time value '2012-01-01T00:00:00' will be in partition 1 along with year 2011 data instead of partition 2 with the rest of year 2012 data as desired. This might not be a concern unless you purge data by partition.

Be mindful that the last partition if a RANGE LEFT function can never be removed. The implication is that the filegroup of the last partition is permanently part of the partition schemes that use the function. In your case, filegroup Year6FileGroup will always be the last filegroup of the psYearlyPartition_Log scheme, although it can also be used for another partition, such as for year 2016 here.

The partitions after the script is run will be as below and it is expected the last partition will be empty (assuming you have no future year data). You should create the partition for next year before the start of the year to avoid splitting a non-empty partition or avoid jumping through these hoops again.

+------------------+----------------+----------------+----------------+
| Partition Number | Lower Boundary | Upper Boundary |   Filegroup    |
+------------------+----------------+----------------+----------------+
|                1 | NULL           | <= 2012-01-01  | Year1FileGroup |
|                2 | > 2012-01-01   | <= 2013-01-01  | Year2FileGroup |
|                3 | > 2013-01-01   | <= 2014-01-01  | Year3FileGroup |
|                4 | > 2014-01-01   | <= 2015-01-01  | Year4FileGroup |
|                5 | > 2015-01-01   | <= 2016-01-01  | Year5FileGroup |
|                6 | > 2016-01-01   | <= 2017-01-01  | Year6FileGroup |
|                7 | > 2017-01-01   | <= 2018-01-01  | Year7FileGroup |
|                8 | > 2018-01-01   | <= 2019-01-01  | Year8FileGroup |
|                9 | > 2019-01-01   | NULL           | Year6FileGroup |
+------------------+----------------+----------------+----------------+

I generally recommend a RANGE RIGHT function when partitioning on incremental values like datetime because the behavior is more intuitive. See Table Partitioning Best Practices for this and other considerations.

--create identical partition function and scheme
CREATE PARTITION FUNCTION pfYearlyPartition_Log_Staging(datetime) 
    AS RANGE LEFT FOR VALUES(
      '20120101'
    , '20130101'
    , '20140101'
    , '20150101'
    , '20160101'
);
CREATE PARTITION SCHEME psYearlyPartition_Log_Staging
    AS PARTITION pfYearlyPartition_Log_Staging TO (
      Year1FileGroup
    , Year2FileGroup
    , Year3FileGroup
    , Year4FileGroup
    , Year5FileGroup
    , Year6FileGroup
);

--create identical aligned staging table
CREATE TABLE [dbo].[Log_Staging](
      [RefNo] [nchar](7) NOT NULL,
      [DevID] [nvarchar](7) NOT NULL,
      [Time] [datetime] NOT NULL,
      [summary] [float] NULL,
      [staging] [float] NULL,
      [position] [float] NULL,
CONSTRAINT [PK_Log_Staging] PRIMARY KEY CLUSTERED
(
      [RefNo] ASC,
      [DevID] ASC,
      [Time] ASC
) 
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 96)
) ON psYearlyPartition_Log_Staging(Time);

--switch last partition into staging table
ALTER TABLE dbo.Log
    SWITCH PARTITION $PARTITION.pfYearlyPartition_Log('20170101')
    TO dbo.Log_Staging PARTITION $PARTITION.pfYearlyPartition_Log_Staging('20170101');

--create new partitions for main table (assuming filegroups already exist for years 7 and 8)
ALTER PARTITION SCHEME psYearlyPartition_Log
    NEXT USED Year6FileGroup;
ALTER PARTITION FUNCTION pfYearlyPartition_Log()
    SPLIT RANGE('20170101');
ALTER PARTITION SCHEME psYearlyPartition_Log
    NEXT USED Year7FileGroup;
ALTER PARTITION FUNCTION pfYearlyPartition_Log()
    SPLIT RANGE('20180101');
ALTER PARTITION SCHEME psYearlyPartition_Log
    NEXT USED Year8FileGroup;
ALTER PARTITION FUNCTION pfYearlyPartition_Log()
    SPLIT RANGE('20190101');

--we must specify IGNORE_DUP_KEY=OFF in order to use DROP_EXISTING_ON here
--but, since the original table PK index definition is unchanged, that's a non-issue
CREATE UNIQUE CLUSTERED INDEX [PK_Log_Staging] ON dbo.Log_Staging 
(
      [RefNo] ASC,
      [DevID] ASC,
      [Time] ASC
) 
WITH (DROP_EXISTING=ON, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 96)
ON psYearlyPartition_Log(Time);

ALTER TABLE dbo.Log_Staging
    SWITCH PARTITION $PARTITION.pfYearlyPartition_Log('20170101')
    TO dbo.Log PARTITION $PARTITION.pfYearlyPartition_Log('20170101');
ALTER TABLE dbo.Log_Staging
    SWITCH PARTITION $PARTITION.pfYearlyPartition_Log('20180101')
    TO dbo.Log PARTITION $PARTITION.pfYearlyPartition_Log('20180101');
ALTER TABLE dbo.Log_Staging
    SWITCH PARTITION $PARTITION.pfYearlyPartition_Log('20190101')
    TO dbo.Log PARTITION $PARTITION.pfYearlyPartition_Log('20190101');