The goal is to reduce the amount of transaction log by only updating metadata.
When data is move or remove from a partition, it Insert and Delete rows which will result in (tons of) LOB_INSERT_ROWS
and LOB_DELETE_ROW
in the transaction logs.
The only option would be to truncate the partition but this option does not exist.
We can avoid it by only using Merge
and Split
on empty Partitions.
In the sample below, I will keep things shorter and only create data for the past 3 months (ie. August, September and October) but you can easily expand it to 9 months. August will be removed once data starts being added to November and so on with September and December...
Create Files and Filegroups:
I first create 6 files and file groups [Part_0] to [Part_5]:
Alter Database [Test] Add Filegroup [Part_0];
...
Alter Database [Test] Add Filegroup [Part_5];
Alter Database [Test] Add File( NAME = N'Part_0', FILENAME = N'...\Part_0.ndf' , SIZE = 100MB , FILEGROWTH = 100MB ) TO Filegroup [Part_0];
...
Alter Database [Test] Add File( NAME = N'Part_5', FILENAME = N'...\Part_5.ndf' , SIZE = 100MB , FILEGROWTH = 100MB ) TO Filegroup [Part_5];
Create Function and Scheme:
Create Partition Function [DateKeyPartFunction] (datetime2)
as Range Right For Values ('20150801', '20150901', '20151001', '20151101', '20151201');
Create Partition Scheme [DateKeyPartScheme] as Partition [DateKeyPartFunction]
To ([Part_0], [Part_1], [Part_2], [Part_3], [Part_4], [Part_5]);
Again there are 6 partitions. This will be explained later but this is mostly due to the need to have empty partitions.
Create Table and Clustered Index:
Because I don't know the exact design of your table, I will be working with this table:
Create Table dbo.DataPart(id int identity(0, 1), name char(1000), name_date datetime2);
and this Clustered Index
:
Create Clustered Index IDX_Part On dbo.DataPart(name_date) On DateKeyPartScheme(name_date);
Dummy data:
This code creates over a million record in a range of dummy dates every 6 seconds from October to August(now):
With inc(n) as(
Select ROW_NUMBER() over(order by (select 1))-1 From (
Select 1 From (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x1(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x2(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x3(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x4(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x5(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x6(n)
) as x(n)
)
Insert into dbo.DataPart(name, name_date)
Select TOP(1000000) '', DATEADD(second, -n*6, getdate()) From inc;
Partitioned Data:
This will partionned as follow:
Id Partition Left Bound Right Bound Row Count
1 [Part_0] < '20150801' 0
2 [Part_1] >= '20150801' < '20150901' 184042
3 [Part_2] >= '20150901' < '20151001' 432000
4 [Part_3] >= '20151001' < '20151101' 383958
5 [Part_4] >= '20151101' < '20151201' 0
6 [Part_5] >= '20151201' 0
- [Part_1], [Part_2], [Part_3] contain data for August, September and October.
- [Part_0] (ie. <'20150801') only exists because it needs an empty partition before we can empty [Part_1] (August) once in November. It would be data for July and before but must remain empty.
- It make things easier to already have an empty partition for next month although it will remain empty until it reach November. This is the purpose of [Part_4] for November.
- Partition Scheme needs an extra partition for everything outside of the right boundary. This is [Part_5] for December and beyond and it must remain empty as well.
Add November:
Once in November, new rows will go to [Part_4] and August data can be removed from [Part_1].
The only way to remove it without having to delete hundred of thousands of rows is to move [Part_1] away from the table:
Create Table dbo.DataPart_Temp(id int identity(0, 1), name char(1000), name_date datetime2);
Create Clustered Index IDX_Part_temp On dbo.DataPart_temp(name_date) On [Part_1];
Alter Table DataPart Switch Partition 2 to DataPart_temp Partition 1;
DataPart_Temp
must be identical to DataPart
(columns, indexes)
- Because [Part_1] is move from
DataPart
to DataPart_temp
, the clustered index on DataPart_temp
must be created on the same filegroup: [Part_1]
- [Part_1] is the second partition of
DataPart
and is switch to the first and only Partition of DataPart_temp
. All August rows are now in DataPart_temp
.
the table is now partitioned as follow:
id Partition Left Bound Right Bound Row Count
1 [Part_0] < '20150801' 0
2 [Part_1] >= '20150801' < '20150901' 0
3 [Part_2] >= '20150901' < '20151001' 432000
4 [Part_3] >= '20151001' < '20151101' 383958
5 [Part_4] >= '20151101' < '20151201' 0
6 [Part_5] >= '20151201' 0
Merge Partition:
[Part_0] and [Part_1] are now empty and can be merge:
Alter Partition Function [DateKeyPartFunction]() Merge Range ('20150801');
[Part_1] has been removed:
id Partition Left Bound Right Bound Row Count
1 [Part_0] < '20150901' 0
2 [Part_2] >= '20150901' < '20151001' 432000
3 [Part_3] >= '20151001' < '20151101' 383958
4 [Part_4] >= '20151101' < '20151201' 0
5 [Part_5] >= '20151201' 0
Add Next month:
Now that [Part_1] is not used anymore, it can be added to the Partition Scheme as the next available partition:
Alter Partition Scheme [DateKeyPartScheme] Next Used [Part_1];
And then [Part_5] (December, >= '20151201') can be splited:
Alter Partition Function [DateKeyPartFunction]() Split Range ('20160101');
Since [Part_5] is empty, nothing has to be move. The other half of [Part_5] will go to the next available partition which is [Part_1]:
Id Partition Left Bound Right Bound Row Count
1 [Part_0] < '20150901' 0
2 [Part_2] >= '20150901' < '20151001' 432000
3 [Part_3] >= '20151001' < '20151101' 383958
4 [Part_4] >= '20151101' < '20151201' x rows in November
5 [Part_5] >= '20151201' < '20160101' 0
6 [Part_1] >= '20160101' 0
Clean Up:
DataPart_temp
can now be truncated and then dropped (or at least drop its Clustered Index).
Add Following months:
In December, Partition ID 2 must be move away ([Part_2]), merged with partition 1 and then added back before January split.
In order to automate this process, you will have to look for the name of the filegroup for Partition 2:
And then create dynamic SQL to:
- Add Clustered Index to DataPart_temp on [Part_X] (similar to DataPart)
- Switch [Part_X] from DataPart to DataPart_temp
- Merge Partition 1 and 2
- Add [Part_X] to DataPart
- Split the last partition
- Truncate DataPart_temp
- Drop DataPart_temp or its Clustered Index
If you use fn_dblog, you should see that transaction logs are minimal.
Best Answer
Ordinarily, SQL Server only checks for a read-only filegroup just before attempting to write to the partition. No error is thrown when reading from a read-only filegroup, as you would expect.
SQL Server can therefore touch any rows it likes, so long as only rows located on read-write filegroups qualify for the update. To put it another way: SQL Server can fully scan all partitions (e.g. using a Clustered Index Scan) looking for rows to update, so long as the only rows that arrive at the Clustered Index Update are on read-write partitions.
In this scenario, there are two separate storage engine rowsets involved: one for the read, and one for the update. The reading rowset will not throw an error if it encounters a read-only partition because it is only reading. The writing rowset will throw an error, because it is configured to perform changes.
The catch
In execution plan shapes where there is no blocking operator between reads and writes to the base table, SQL Server may apply an optimization ("Rowset Sharing"), whereby the two operators share a single storage engine rowset.
In this situation, reading from a read-only partition will throw the error reported in the question. The single rowset is configured for both read and write - touching a read-only partition results in the error.
Adding a key from the unique clustered index to the set clause means a split, sort, collapse combination is required to avoid transient key violations. As a side-effect, the sort (a blocking operator) prevents the rowset sharing optimization from being applied.
For more details, see my article Changes to a Writable Partition May Fail Unexpectedly