If it is adding a row at the end of the index it will just allocate a new page for the row rather than split the current end page. Experimental evidence for this is below (uses the %%physloc%%
function which requires SQL Server 2008). See also the discussion here.
CREATE TABLE T
(
id int identity(1,1) PRIMARY KEY,
filler char(1000)
)
GO
INSERT INTO T
DEFAULT VALUES
GO 7
GO
SELECT sys.fn_PhysLocFormatter(%%physloc%%)
FROM T
GO
INSERT INTO T
DEFAULT VALUES
GO
SELECT sys.fn_PhysLocFormatter(%%physloc%%)
FROM T
GO
DROP TABLE T
Returns (Your results will vary)
(1:173:0) /*File:Page:Slot*/
(1:173:1)
(1:173:2)
(1:173:3)
(1:173:4)
(1:173:5)
(1:173:6)
(1:110:0) /*Final insert is on a new page*/
This does only appear to apply to leaf nodes though. This can be seen by running the below and adjusting the TOP
value. For me 622/623
was the cut off point between requiring one and two first level pages (might vary if you have snapshot isolation enabled?). It does split the page in a balanced manner leading to wasted space at this level.
USE tempdb;
CREATE TABLE T2
(
id int identity(1,1) PRIMARY KEY CLUSTERED,
filler char(8000)
)
INSERT INTO T2(filler)
SELECT TOP 622 'A'
FROM master..spt_values v1, master..spt_values v2
DECLARE @index_info TABLE
(PageFID VARCHAR(10),
PagePID VARCHAR(10),
IAMFID tinyint,
IAMPID int,
ObjectID int,
IndexID tinyint,
PartitionNumber tinyint,
PartitionID bigint,
iam_chain_type varchar(30),
PageType tinyint,
IndexLevel tinyint,
NextPageFID tinyint,
NextPagePID int,
PrevPageFID tinyint,
PrevPagePID int,
Primary Key (PageFID, PagePID));
INSERT INTO @index_info
EXEC ('DBCC IND ( tempdb, T2, -1)' );
DECLARE @DynSQL nvarchar(max) = 'DBCC TRACEON (3604);'
SELECT @DynSQL = @DynSQL + '
DBCC PAGE(tempdb, ' + PageFID + ', ' + PagePID + ', 3); '
FROM @index_info
WHERE IndexLevel = 1
SET @DynSQL = @DynSQL + '
DBCC TRACEOFF(3604); '
EXEC(@DynSQL)
DROP TABLE T2
The answer by ThomasStringer is very good and the usage of partitioning here would likely help overall performance and decrease your maintenance cost - however, it won't do anything for your page split scenario.
Can you verify that your FK's are trusted? This won't decrease or impact your page splits but it will help your query execution time.
select 'ALTER TABLE ' + s.name + '.' + OBJECT_NAME(o.object_id) + ' WITH CHECK CHECK CONSTRAINT ' + i.name + '
GO'
FROM sys.foreign_keys i INNER JOIN sys.objects o ON i.parent_object_id = o.object_id INNER JOIN sys.schemas s ON o.schema_id = s.schema_id WHERE i.is_not_trusted = 1 AND i.is_not_for_replication = 0
-- Foreign Key Check... Will build the statement below to CHECK (validate) the FK for those FK's that are untrusted.
UNION
select 'ALTER TABLE ' + s.name + '.' + OBJECT_NAME(o.object_id) + ' WITH CHECK CHECK CONSTRAINT ' + i.name + '
GO'
from sys.check_constraints i join sys.objects o on i.parent_object_id = o.object_id join sys.schemas s on o.schema_id = s.schema_id where i.is_not_trusted = 1 and i.is_not_for_replication = 0
-- Check Constraint Check... Will build the statement below to CHECK (validate) the check constraints for those check constraints that are untrusted.
-- !!!! -- The output (in TEXT) will look similar to what you see below. once you have the output, put it into the query window and execute it.
Another idea you already hit on, using fill factor - if you use a low fill factor and combine that with either row or page level compression, your data density, per page will increase. Again, this won't help your page split problem, BUT it can decrease the quantity of page splits (due to having a higher data density per page).
Lastly, if you can, look at changing your index and lead with the log_time column instead of the foreign key column. This change, depending on how the log_time data comes in (hopefully it's more "in order" than the FK column). This could decrease your page splits significantly. If you pair this with both compression (ROW) and a good partition scheme, you might see some significant improvements.
Best Answer
From Books Online (emphasis mine):
Fill factor applies to the leaf pages. PAD_INDEX will determine what happens to non-leaf pages. From the same BoL page:
Page splitting can occur at all levels of the BTree, including the root level. Due to fan-out, however, the frequency at which intermediate pages split is (typically) an order of magnitude less than that at which leaves split. Also there are many fewer non-leaf pages than leaf pages. Consequently PAD_INDEX is less concerning than FILLFACTOR. Still worth thinking about, especially on randomly-inserted indexes.
During writes, pages will be filled up to 100%. Then they will split 50/50, or as close as possible to ensure each row is contained entirely on a single page. I believe the split is always 50/50. Subsequent writes may be skewed, however, so one of the new pages fills much quicker than the other.