SQL Server – Will Re-Seeding an Identity Column Back to 0 Cause Page Splits?

clustered-indexperformancesql serversql server 2014

Simple table with an identity table as a clustered primary key. We unexpectedly hit the max value for the column, so I requested our DBA re-seed the table back to 0 to give us some time to examine updating the column. The table is also purged based on time, so we have a substantial gap in Id's to use up before we collide with anything. Since this is a high insert table, the DBA brought up the potential of page splits since we are no longer inserting at the end of the table.

My question is, how will SQL Server behave when the value of the identity column wraps around? Will inserting into gaps cause a performance issue? I want to consider maybe creating a sequence that will automatically cycle rather that up the id column to bigint.

Best Answer

My question is, how will SQL Server behave when the value of the identity column wraps around?

Assuming the index is 100% full the first few inserts at the new key value will cause one or two page splits, where half the rows on the full page are copied to the new page, and then the new row will be inserted into one of the half-full pages.

However after that you resume low-cost inserts and page allocations. When the page fills up a new, empty page is allocated and inserted into the doubly-linked list of leaf pages. No rows are copied to the new page, and the new row is written to the new page.

The only difference between this and an end-of-index insert is the presence of a next page after the new page. And the next page must be updated with a reverse page pointer to the new page.

See this post for details: Good Page Splits and Sequential GUID Key Generation

Will inserting into gaps cause a performance issue?

No.

As SQL Continues inserting into the "middle" of the index, when the target page for an insert is full, SQL will always allocate a new page. It won't put the row on the next existing page, even if that page is not full. The non-leaf levels of the clustered index store the starting value for each leaf-level page, so SQL can't insert the new row onto the next existing page without rewriting the non-leaf level of the index.

You can see this behavior easily by forcing page locks and looking at the lock footprint of your inserts (or less easily with DBCC PAGE). In the example below a clustered index table that can fit exactly 4 rows on a page take a series of mid-index inserts.

Assume every page can hold 4 rows. You have full page existing rows with ascending key values from 10-20, and now you want to insert key values 1-9.

use master
drop database ps_test
go
create database ps_test
go
use ps_test 
go
drop table if exists ps_test
set nocount on


go

--8,060=4*(4+2011)
create table ps_test(id int primary key, data char(2011) not null default '')

go
DBCC TRACEON(3604) 
go

insert into ps_test(id) values (10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21)--,22

go

dbcc ind('ps_test','ps_test',1)
--three data pages
/*

PageFID PagePID         PageType
------- ------- / /     --------
1       336             10      
1       328             1       
1       329             2       
1       330             1       
1       331             1       
*/


insert into ps_test(id) values (1)

dbcc ind('ps_test','ps_test',1)
--four data pages 332 has been inserted between 328 and 330
/*

PageFID PagePID        PageType PrevPagePID NextPagePID 
------- ---------/ /   -------- ----------- ----------- 
1       336            10       0           0           
1       328            1        0           332         
1       329            2        0           0           
1       330            1        332         331         
1       331            1        330         0           
1       332            1        328         330         
*/

dbcc page(ps_test,1,328,3)
-- 328 has rows 1,10 and 4048 bytes free
-- the page is split
/*
PAGE: (1:328)
...
pminlen = 2019                      m_slotCnt = 2                       m_freeCnt = 4048
...
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 1                              
...
Slot 1 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 10                             

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
*/

dbcc page(ps_test,1,332,3)
--page 332 has 11,12,13 and 2024 bytes free
/*

PAGE: (1:332)
...
pminlen = 2019                      m_slotCnt = 3                       m_freeCnt = 2024
...
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 11                             
...
Slot 1 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 12                             
...
Slot 2 Column 1 Offset 0x4 Length 4 Length (physical) 4
id = 13                             
...
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
*/

insert into ps_test(id) values (2)
insert into ps_test(id) values (3)

dbcc ind('ps_test','ps_test',1)
--no page split.  
/*

PageFID PagePID     PageType IndexLevel NextPageFID NextPagePID PrevPageFID PrevPagePID
------- ----------- -------- ---------- ----------- ----------- ----------- -----------
1       336         10       NULL       0           0           0           0
1       328         1        0          1           332         0           0
1       329         2        1          0           0           0           0
1       330         1        0          1           331         1           332
1       331         1        0          0           0           1           330
1       332         1        0          1           330         1           328

the new rows went on page 328, which now has

pminlen = 2019                      m_slotCnt = 4                       m_freeCnt = 2024
*/

So now we have page 328 is full and the next page 332 has only 3 rows. But the lowest value on 332 is 11. So when we insert the value 4, does go on 332, which will cause another bad page split when 5 is inserted?

insert into ps_test(id) values (4)
dbcc ind('ps_test','ps_test',1)

/*
A new page 333 has been inserted in the list between 328 and 332

PageFID PagePID     PageType IndexLevel NextPageFID NextPagePID PrevPageFID PrevPagePID
------- ----------- -------- ---------- ----------- ----------- ----------- -----------
1       336         10       NULL       0           0           0           0
1       328         1        0          1           333         0           0
1       329         2        1          0           0           0           0
1       330         1        0          1           331         1           332
1       331         1        0          0           0           1           330
1       332         1        0          1           330         1           333
1       333         1        0          1           332         1           328

*/

dbcc page(ps_test,1,333,3)

/*

PAGE: (1:333)
...
pminlen = 2019                      m_slotCnt = 1                       m_freeCnt = 6072
...
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 10                             
...
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

*/

So another "bad page split", but this one only moved the row with id=10. And leaves page 328 full with values 1,2,3,4

dbcc page(ps_test,1,328,3)
/*
PAGE: (1:328)
...
pminlen = 2019                      m_slotCnt = 4                       m_freeCnt = 0
...
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 1  
...
Slot 1 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 2 
...
Slot 2 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 3  
...
Slot 3 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 4                              
...                        
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

*/

Now all the larger key values are on seperate pages, and from here on out there are no more bad page splits. The next insert requires a new page, and only the new row is on the new page:

    insert into ps_test(id) values (5)
    dbcc ind('ps_test','ps_test',1)

/*

PageFID PagePID      PageType IndexLevel NextPageFID NextPagePID PrevPageFID PrevPagePID
------- -----------  -------- ---------- ----------- ----------- ----------- -----------
1       336          10       NULL       0           0           0           0
1       328          1        0          1           334         0           0
1       329          2        1          0           0           0           0
1       330          1        0          1           331         1           332
1       331          1        0          0           0           1           330
1       332          1        0          1           330         1           333
1       333          1        0          1           332         1           334
1       334          1        0          1           333         1           328

*/

The new page 334, has only the last-inserted row.

 dbcc page(ps_test,1,334,3)
/*
PAGE: (1:334)
...
pminlen = 2019                      m_slotCnt = 1                       m_freeCnt = 6072
...
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

id = 5     
...
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
id = 5                              


*/

So after a couple of initial "bad page splits" that required moving rows, subsequently new pages are allocated and spliced into the middle of doubly-linked list of leaf pages. The new pages are allocated when a new row wont fit on the previous page and no rows are migrated to the new page.

Best Answer

Related Solutions

Sql-server – Large Fact table and partitioning key dilemma

SQL Server – Identity Value Jumps When Restarting

Related Question