Sql-server – MS SQL Page split confusion

fill-factor, index, page-splits, sql-server

I am a bit confused about page splits in MS SQL and I'm looking for a definitive answer. There seem to be two versions of the story:

1 – Fillfactor only affects how full pages are at the time the index is created/rebuilt. Page splits are always 50/50

2 – Fillfactor also affects how pages are split. So if there is 70% fillfactor and a page overflows, it will split 70/30

Thanks a lot

Best Answer

From Books Online (emphasis mine):

FILLFACTOR = fillfactor

Applies to: SQL Server 2008 through SQL Server 2016, SQL Database V12.

Specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during index creation or rebuild. fillfactor must be an integer value from 1 to 100. If fillfactor is 100, the Database Engine creates indexes with leaf pages filled to capacity.

The FILLFACTOR setting applies only when the index is created or rebuilt. The Database Engine does not dynamically keep the specified percentage of empty space in the pages. To view the fill factor setting, use the sys.indexes catalog view.
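As a minimal sketch of the last sentence of the quote, the query below reads the stored settings from sys.indexes. The table name dbo.T is only a placeholder and is not taken from the question.

```sql
-- Show the stored fill factor and PAD_INDEX setting for every index on a table.
-- dbo.T is a placeholder table name.
SELECT i.name       AS index_name,
       i.type_desc,
       i.fill_factor,   -- 0 = no explicit value; the server-wide default applies
       i.is_padded      -- 1 = PAD_INDEX is ON
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.T');
```

The value shown is whatever was specified at the last create or rebuild; it says nothing about how full the pages are right now.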

Fill factor applies to the leaf pages. PAD_INDEX will determine what happens to non-leaf pages. From the same BoL page:

PAD_INDEX = { ON | OFF }

ON: The percentage of free space that is specified by fillfactor is applied to the intermediate-level pages of the index.

OFF or fillfactor is not specified: The intermediate-level pages are filled to near capacity.

Page splitting can occur at all levels of the B-tree, including the root level. Due to fan-out, however, intermediate pages (typically) split an order of magnitude less often than leaf pages do. There are also far fewer non-leaf pages than leaf pages. Consequently, PAD_INDEX is less of a concern than FILLFACTOR. It is still worth thinking about, especially on randomly-inserted indexes.
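For illustration, here is a sketch of applying both options in one rebuild. The table name dbo.T and the value 70 are placeholders chosen for the example, not recommendations.

```sql
-- Rebuild every index on a placeholder table, leaving about 30% free space
-- on leaf pages (FILLFACTOR) and on intermediate pages (PAD_INDEX = ON).
ALTER INDEX ALL ON dbo.T
REBUILD WITH (FILLFACTOR = 70, PAD_INDEX = ON);
```

The free space exists only immediately after the rebuild; later inserts and updates consume it until the next rebuild.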

During writes, pages fill up to 100%. A full page then splits 50/50, or as close to that as possible while keeping each row entirely on a single page. I believe the split is always 50/50, regardless of the fill factor. Subsequent writes may be skewed, however, so one of the new pages can fill much more quickly than the other.
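One way to check this yourself is sketched below. The table name dbo.SplitDemo and the key values are made up for the test, and it assumes a scratch database such as tempdb. Seven rows of about 1 KB fill one leaf page, so an eighth row inserted into the middle of the key range has to trigger a split.

```sql
USE tempdb;
GO
CREATE TABLE dbo.SplitDemo
(
    id     int        NOT NULL PRIMARY KEY CLUSTERED,
    filler char(1000) NOT NULL DEFAULT 'x'
);
GO
-- Seven rows of roughly 1 KB each fill a single 8 KB leaf page.
INSERT INTO dbo.SplitDemo (id) VALUES (10), (20), (30), (40), (50), (60), (70);
GO
-- An eighth row in the middle of the key range does not fit, so the page must split.
INSERT INTO dbo.SplitDemo (id) VALUES (35);
GO
-- DETAILED mode populates avg_page_space_used_in_percent (LIMITED leaves it NULL).
SELECT index_level, page_count, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.SplitDemo'), NULL, NULL, 'DETAILED');
GO
DROP TABLE dbo.SplitDemo;
```

If the description above holds, the leaf level (index_level 0) should report two pages that are each roughly half full. Recreating the table WITH (FILLFACTOR = 70) on the primary key and repeating the test is a simple way to see whether the setting changes the split ratio.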

Related Solutions

Sql-server – B-tree node split strategy in SQL Server for monotonically increasing value

If a row is added at the end of the index, SQL Server just allocates a new page for the row rather than splitting the current last page. Experimental evidence for this is below (it uses the undocumented %%physloc%% virtual column, which requires SQL Server 2008 or later). See also the discussion here.

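CREATE TABLE T
(
    id int identity(1,1) PRIMARY KEY,
    filler char(1000)
)
GO

-- Insert 7 rows: enough to fill one 8 KB page
INSERT INTO T
DEFAULT VALUES
GO 7

-- Physical location (File:Page:Slot) of each row before the extra insert
SELECT sys.fn_PhysLocFormatter(%%physloc%%)
FROM T
GO

-- Insert an 8th row at the end of the index
INSERT INTO T
DEFAULT VALUES
GO

-- Physical location of each row after the extra insert
SELECT sys.fn_PhysLocFormatter(%%physloc%%)
FROM T
GO

DROP TABLE T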

The second SELECT returns (your results will vary):

(1:173:0) /*File:Page:Slot*/
(1:173:1)
(1:173:2)
(1:173:3)
(1:173:4)
(1:173:5)
(1:173:6)
(1:110:0) /*Final insert is on a new page*/

This appears to apply only to leaf nodes, though. This can be seen by running the script below and adjusting the TOP value. For me, 622/623 rows was the cut-off point between requiring one and two first-level pages (this might vary if you have snapshot isolation enabled). At that level the page is split in a balanced manner, which leads to wasted space.

USE tempdb;

CREATE TABLE T2
(
id int identity(1,1) PRIMARY KEY CLUSTERED,
filler char(8000)
)

INSERT INTO T2(filler)
SELECT TOP 622 'A'
FROM master..spt_values v1,  master..spt_values v2

DECLARE @index_info  TABLE
(PageFID  VARCHAR(10), 
  PagePID VARCHAR(10),   
  IAMFID   tinyint, 
  IAMPID  int, 
  ObjectID  int,
  IndexID  tinyint,
  PartitionNumber tinyint,
  PartitionID bigint,
  iam_chain_type  varchar(30),    
  PageType  tinyint, 
  IndexLevel  tinyint,
  NextPageFID  tinyint,
  NextPagePID  int,
  PrevPageFID  tinyint,
  PrevPagePID int, 
  Primary Key (PageFID, PagePID));

INSERT INTO @index_info 
    EXEC ('DBCC IND ( tempdb, T2, -1)'  ); 

DECLARE @DynSQL nvarchar(max) = 'DBCC TRACEON (3604);'
SELECT @DynSQL = @DynSQL + '
DBCC PAGE(tempdb, ' + PageFID + ', ' + PagePID + ', 3); '
FROM @index_info     
WHERE IndexLevel = 1

SET @DynSQL = @DynSQL + '
DBCC TRACEOFF(3604); '

EXEC(@DynSQL)


DROP TABLE T2

Sql-server – index maintenance strategy if few insertion points relative to # of rows

The answer by ThomasStringer is very good, and using partitioning here would likely help overall performance and decrease your maintenance cost. However, it won't do anything for your page split scenario.

Can you verify that your FKs are trusted? This won't reduce or otherwise affect your page splits, but it will help your query execution time. The query below generates the statements needed to re-validate any untrusted constraints.

-- Builds ALTER TABLE ... WITH CHECK CHECK CONSTRAINT statements that validate
-- every untrusted foreign key and check constraint.
-- Run it with results to text, then paste the output into a query window and execute it.
-- QUOTENAME protects names that contain spaces or reserved words.

-- Untrusted foreign keys
SELECT 'ALTER TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(o.name)
     + ' WITH CHECK CHECK CONSTRAINT ' + QUOTENAME(i.name) + '
GO'
FROM sys.foreign_keys i
INNER JOIN sys.objects o ON i.parent_object_id = o.object_id
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE i.is_not_trusted = 1 AND i.is_not_for_replication = 0
UNION
-- Untrusted check constraints
SELECT 'ALTER TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(o.name)
     + ' WITH CHECK CHECK CONSTRAINT ' + QUOTENAME(i.name) + '
GO'
FROM sys.check_constraints i
INNER JOIN sys.objects o ON i.parent_object_id = o.object_id
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE i.is_not_trusted = 1 AND i.is_not_for_replication = 0

Another idea, which you have already hit on, is fill factor. If you use a low fill factor and combine it with either row- or page-level compression, your data density per page will increase. Again, this won't eliminate the cause of your page splits, but it can decrease their number (because more rows fit on each page).

Lastly, if you can, look at changing your index to lead with the log_time column instead of the foreign key column. Depending on how the log_time data comes in (hopefully it's more "in order" than the FK column), this change could decrease your page splits significantly. If you pair it with both compression (ROW) and a good partition scheme, you might see some significant improvements.
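A rough sketch of what that could look like is below. The table, index, and FK column names (dbo.EventLog, IX_EventLog_log_time, parent_id) are hypothetical, since the actual schema isn't shown here; only log_time comes from the question. It also assumes the index in question is nonclustered.

```sql
-- Hypothetical names: dbo.EventLog is the log table, parent_id is the FK column.
-- Leading with log_time means new rows tend to land at the end of the index.
CREATE NONCLUSTERED INDEX IX_EventLog_log_time
ON dbo.EventLog (log_time, parent_id)
WITH (DATA_COMPRESSION = ROW);
```

Whether this is a net win depends on your queries: anything that seeks on the FK column alone would no longer be able to use this index for a seek.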
