Sql-server – Error Updating Record on Table with Partition on a Read-Only Filegroup

partitioningsql server

We are working to partition our data warehouse (SQL Server 2012 Enterprise): moving old records we don't expect to change to a read-only filegroup on a monthly basis. We're getting errors that records cannot be updated even when the record is in the read-write filegroup and would not move to the read-only partition (we are not updating the partitioning column) due to the update.

The partitioning column is ETL_VERS_END_DTM. The following update statement fails as if the row needed to be moved to partition 1 (values less than 1899-12-31 23:59:59 which is expected to be an empty partition).

UPDATE  DW_STAGE_PHOENIX.dbo.STG_PHX_LN_ACCT_INT_OPT
SET     ETL_JOB_SEQ_NUM = 1027140 ,
        ETL_IUD_CD = 'D' ,
        ETL_UPD_DTM = CONVERT(DATETIME, '03/16/2016 01:01:17')
WHERE   ETL_VERS_START_DTM = CONVERT(DATETIME, '12/27/2014 00:26:24.137')
        AND ACCT_NO = '5001194157'
        AND ACCT_TYPE = 'ILN';

Error:

Msg 652, Level 16, State 1, Line 1 The index
"PK_STG_PHX_LN_ACCT_INT_OPT" for table "dbo.STG_PHX_LN_ACCT_INT_OPT"
(RowsetId 72057594415022080) resides on a read-only filegroup
("ARCHIVE"), which cannot be modified.

Explicitly setting the partitioned column to its own existing value or putting anything in the where clause that gets us the correct partition elimination removes the condition that causes this error. Neither solution is considered a 'fix' for the issue by the vendor or our data warehouse admin. What is causing SQL Server to attempt to move the record if the column value is not changing?

UPDATE:
So my problem is me misunderstanding that if the execution plan thinks it might touch the read-only partition it will error out (even though no records have been touched). The problem ultimately is that the database was not built originally (or loaded) with a partitioning as a consideration as well as lack of desire to customize update processes by our DW team. Partitioning is being built in to manage maintenance times as the database continues to grow due to a business requirement to never delete anything ever. [Did that sound a little bitter?]

Best Answer

Ordinarily, SQL Server only checks for a read-only filegroup just before attempting to write to the partition. No error is thrown when reading from a read-only filegroup, as you would expect.

SQL Server can therefore touch any rows it likes, so long as only rows located on read-write filegroups qualify for the update. To put it another way: SQL Server can fully scan all partitions (e.g. using a Clustered Index Scan) looking for rows to update, so long as the only rows that arrive at the Clustered Index Update are on read-write partitions.

In this scenario, there are two separate storage engine rowsets involved: one for the read, and one for the update. The reading rowset will not throw an error if it encounters a read-only partition because it is only reading. The writing rowset will throw an error, because it is configured to perform changes.

The catch

In execution plan shapes where there is no blocking operator between reads and writes to the base table, SQL Server may apply an optimization ("Rowset Sharing"), whereby the two operators share a single storage engine rowset.

In this situation, reading from a read-only partition will throw the error reported in the question. The single rowset is configured for both read and write - touching a read-only partition results in the error.

Adding a key from the unique clustered index to the set clause means a split, sort, collapse combination is required to avoid transient key violations. As a side-effect, the sort (a blocking operator) prevents the rowset sharing optimization from being applied.

For more details, see my article Changes to a Writable Partition May Fail Unexpectedly

Related Solutions

Sql-server – Switching a partition on a table with one to many relation

You mentioned you added datetime to the PK, but it was not partitioned. You also need to specify the partition scheme when you recreate the PK index so that it is aligned.

Also, note there are restrictions on tables with foreign key relationships. See Transferring Data Efficiently by Using Partition Switching.

Unfortunately, SQL Server didn't ask anything about index creation on partition or filegroup. I performed that with visual features.

The GUI is nice for some simple quick tasks but not for partitioning a table with billions of rows. Use T-SQL so that you know exactly what is going on. Large tables are unforgiving.

Sql-server – How to implement table purge with partition on SQL Server 2012

The goal is to reduce the amount of transaction log by only updating metadata.

When data is move or remove from a partition, it Insert and Delete rows which will result in (tons of) LOB_INSERT_ROWS and LOB_DELETE_ROW in the transaction logs.

The only option would be to truncate the partition but this option does not exist. We can avoid it by only using Merge and Split on empty Partitions.

In the sample below, I will keep things shorter and only create data for the past 3 months (ie. August, September and October) but you can easily expand it to 9 months. August will be removed once data starts being added to November and so on with September and December...

Create Files and Filegroups:

I first create 6 files and file groups [Part_0] to [Part_5]:

Alter Database [Test] Add Filegroup [Part_0];
...
Alter Database [Test] Add Filegroup [Part_5];
Alter Database [Test] Add File( NAME = N'Part_0', FILENAME = N'...\Part_0.ndf' , SIZE = 100MB , FILEGROWTH = 100MB ) TO Filegroup [Part_0];
...
Alter Database [Test] Add File( NAME = N'Part_5', FILENAME = N'...\Part_5.ndf' , SIZE = 100MB , FILEGROWTH = 100MB ) TO Filegroup [Part_5];

Create Function and Scheme:

Create Partition Function [DateKeyPartFunction] (datetime2)
as Range Right For Values ('20150801', '20150901', '20151001', '20151101', '20151201');
Create Partition Scheme [DateKeyPartScheme] as Partition [DateKeyPartFunction]
To ([Part_0], [Part_1], [Part_2], [Part_3], [Part_4], [Part_5]);

Again there are 6 partitions. This will be explained later but this is mostly due to the need to have empty partitions.

Create Table and Clustered Index:

Because I don't know the exact design of your table, I will be working with this table:

Create Table dbo.DataPart(id int identity(0, 1), name char(1000), name_date datetime2);

and this Clustered Index:

Create Clustered Index IDX_Part On dbo.DataPart(name_date) On DateKeyPartScheme(name_date);

Dummy data:

This code creates over a million record in a range of dummy dates every 6 seconds from October to August(now):

With inc(n) as(
    Select ROW_NUMBER() over(order by (select 1))-1 From (
        Select 1 From (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x1(n)
        Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x2(n)
        Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x3(n)
        Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x4(n)
        Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x5(n)
        Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x6(n)
    ) as x(n)
)
Insert into dbo.DataPart(name, name_date) 
Select TOP(1000000) '', DATEADD(second, -n*6, getdate()) From inc;

Partitioned Data:

This will partionned as follow:

Id Partition    Left Bound      Right Bound     Row Count
1  [Part_0]                 < '20150801'    0
2  [Part_1] >= '20150801'   < '20150901'    184042
3  [Part_2] >= '20150901'   < '20151001'    432000
4  [Part_3] >= '20151001'   < '20151101'    383958
5  [Part_4] >= '20151101'   < '20151201'    0
6  [Part_5] >= '20151201'                   0

[Part_1], [Part_2], [Part_3] contain data for August, September and October.
[Part_0] (ie. <'20150801') only exists because it needs an empty partition before we can empty [Part_1] (August) once in November. It would be data for July and before but must remain empty.
It make things easier to already have an empty partition for next month although it will remain empty until it reach November. This is the purpose of [Part_4] for November.
Partition Scheme needs an extra partition for everything outside of the right boundary. This is [Part_5] for December and beyond and it must remain empty as well.

Add November:

Once in November, new rows will go to [Part_4] and August data can be removed from [Part_1]. The only way to remove it without having to delete hundred of thousands of rows is to move [Part_1] away from the table:

Create Table dbo.DataPart_Temp(id int identity(0, 1), name char(1000), name_date datetime2);
Create Clustered Index IDX_Part_temp On dbo.DataPart_temp(name_date) On [Part_1];
Alter Table DataPart Switch Partition 2 to DataPart_temp Partition 1;

DataPart_Temp must be identical to DataPart (columns, indexes)
Because [Part_1] is move from DataPart to DataPart_temp, the clustered index on DataPart_temp must be created on the same filegroup: [Part_1]
[Part_1] is the second partition of DataPart and is switch to the first and only Partition of DataPart_temp. All August rows are now in DataPart_temp.

the table is now partitioned as follow:

id Partition    Left Bound      Right Bound     Row Count
1  [Part_0]                 < '20150801'    0
2  [Part_1] >= '20150801'   < '20150901'    0
3  [Part_2] >= '20150901'   < '20151001'    432000
4  [Part_3] >= '20151001'   < '20151101'    383958
5  [Part_4] >= '20151101'   < '20151201'    0
6  [Part_5] >= '20151201'                   0

Merge Partition:

[Part_0] and [Part_1] are now empty and can be merge:

Alter Partition Function [DateKeyPartFunction]() Merge Range ('20150801');

[Part_1] has been removed:

id Partition    Left Bound      Right Bound     Row Count
1  [Part_0]                 < '20150901'    0
2  [Part_2] >= '20150901'   < '20151001'    432000
3  [Part_3] >= '20151001'   < '20151101'    383958
4  [Part_4] >= '20151101'   < '20151201'    0
5  [Part_5] >= '20151201'                   0

Add Next month:

Now that [Part_1] is not used anymore, it can be added to the Partition Scheme as the next available partition:

Alter Partition Scheme [DateKeyPartScheme] Next Used [Part_1];

And then [Part_5] (December, >= '20151201') can be splited:

Alter Partition Function [DateKeyPartFunction]() Split Range ('20160101');

Since [Part_5] is empty, nothing has to be move. The other half of [Part_5] will go to the next available partition which is [Part_1]:

Id Partition    Left Bound      Right Bound     Row Count
1  [Part_0]                 < '20150901'    0
2  [Part_2] >= '20150901'   < '20151001'    432000
3  [Part_3] >= '20151001'   < '20151101'    383958
4  [Part_4] >= '20151101'   < '20151201'    x rows in November
5  [Part_5] >= '20151201'   < '20160101'    0
6  [Part_1] >= '20160101'                   0

Clean Up:

DataPart_temp can now be truncated and then dropped (or at least drop its Clustered Index).

Add Following months:

In December, Partition ID 2 must be move away ([Part_2]), merged with partition 1 and then added back before January split.

In order to automate this process, you will have to look for the name of the filegroup for Partition 2:

And then create dynamic SQL to:

Add Clustered Index to DataPart_temp on [Part_X] (similar to DataPart)
Switch [Part_X] from DataPart to DataPart_temp
Merge Partition 1 and 2
Add [Part_X] to DataPart
Split the last partition
Truncate DataPart_temp
Drop DataPart_temp or its Clustered Index

If you use fn_dblog, you should see that transaction logs are minimal.

Best Answer

The catch

Related Solutions

Sql-server – Switching a partition on a table with one to many relation

Sql-server – How to implement table purge with partition on SQL Server 2012

Related Question