The row-versioning framework introduced in SQL Server 2005 is used to support a number of features, including the new transaction isolation levels READ_COMMITTED_SNAPSHOT and SNAPSHOT. Even when neither of these isolation levels is enabled, row-versioning is still used for AFTER triggers (to facilitate generation of the inserted and deleted pseudo-tables), MARS, and (in a separate version store) online indexing.
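To see whether either of the two isolation levels is enabled for a database, the settings can be read from sys.databases. This is a minimal sketch, assuming it is run in the context of the database of interest:

```sql
-- Report whether the row-versioning isolation levels are enabled
-- for the current database.
SELECT
    d.name,
    d.snapshot_isolation_state_desc,   -- ON / OFF (or a transitional state)
    d.is_read_committed_snapshot_on    -- 1 = READ_COMMITTED_SNAPSHOT enabled
FROM sys.databases AS d
WHERE d.database_id = DB_ID();
```

Even if both report as off, the trigger, MARS, and online indexing cases described above can still cause rows to be versioned.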
As documented, the engine may add a 14-byte postfix to each row of a table that is versioned for any of these purposes. This behaviour is relatively well-known, as is the addition of the 14 bytes to every row of an index that is rebuilt online with a row-versioning isolation level enabled. Even where the isolation levels are not enabled, one extra byte is added to non-clustered indexes only when rebuilt ONLINE.
Where an AFTER trigger is present, and versioning would otherwise add 14 bytes per row, an optimization exists within the engine to avoid this, but only where a ROW_OVERFLOW or LOB allocation cannot occur. In practice, this means the maximum possible size of a row must be less than 8060 bytes. In calculating maximum possible row sizes, the engine assumes, for example, that a VARCHAR(460) column could contain 460 characters.
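As a rough check of which side of the limit a table falls on, the declared maximum length of each column can be summed from sys.columns. This is only an approximate sketch: it counts declared data bytes and ignores per-row overhead (row header, NULL bitmap, variable-length offset array), so the figure the engine works with will be somewhat higher. The name dbo.Example is the table used in the script in this answer; substitute any existing table.

```sql
-- Sum the declared maximum byte length of every column in the table.
-- Columns declared as (MAX) report max_length = -1 and are counted separately,
-- because they can always be stored off-row as LOB data.
SELECT
    SUM(CASE WHEN c.max_length > 0 THEN CONVERT(integer, c.max_length) ELSE 0 END)
        AS declared_data_bytes,
    SUM(CASE WHEN c.max_length = -1 THEN 1 ELSE 0 END)
        AS max_type_columns
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.Example', N'U');
```

A result close to 8060 bytes, or any (MAX) column, suggests the optimization may not apply to that table.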
The behaviour is easiest to see with an AFTER UPDATE trigger, though the same principle applies to AFTER DELETE. The following script creates a table with a maximum in-row length of 8060 bytes. The data fits on a single page, with 13 bytes of free space on that page. A no-op trigger exists, so the page is split and versioning information added:
USE Sandpit;
GO
-- Table whose maximum possible in-row size reaches the 8060-byte limit
CREATE TABLE dbo.Example
(
    ID          integer NOT NULL IDENTITY(1,1),
    Value       integer NOT NULL,
    Padding1    char(42) NULL,
    Padding2    varchar(8000) NULL,

    CONSTRAINT PK_Example_ID
        PRIMARY KEY CLUSTERED (ID)
);
GO
-- Add 137 rows (the CTEs generate 256 candidate rows)
WITH
    N1 AS (SELECT 1 AS n UNION ALL SELECT 1),
    N2 AS (SELECT L.n FROM N1 AS L CROSS JOIN N1 AS R),
    N3 AS (SELECT L.n FROM N2 AS L CROSS JOIN N2 AS R),
    N4 AS (SELECT L.n FROM N3 AS L CROSS JOIN N3 AS R)
INSERT TOP (137) dbo.Example
    (Value)
SELECT
    ROW_NUMBER() OVER (ORDER BY (SELECT 0))
FROM N4;
GO
-- Pack the rows as tightly as possible
ALTER INDEX PK_Example_ID
ON dbo.Example
REBUILD WITH (FILLFACTOR = 100);
GO
-- Leaf-level page count and row sizes before the update
SELECT
    ddips.index_type_desc,
    ddips.alloc_unit_type_desc,
    ddips.index_level,
    ddips.page_count,
    ddips.record_count,
    ddips.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Example', N'U'), 1, 1, 'DETAILED') AS ddips
WHERE
    ddips.index_level = 0;
GO
-- No-op trigger: its presence alone causes the row versioning
CREATE TRIGGER ExampleTrigger
ON dbo.Example
AFTER DELETE, UPDATE
AS RETURN;
GO
-- Update a single row
UPDATE dbo.Example
SET Value = -Value
WHERE ID = 1;
GO
-- Leaf-level page count and row sizes after the update
SELECT
    ddips.index_type_desc,
    ddips.alloc_unit_type_desc,
    ddips.index_level,
    ddips.page_count,
    ddips.record_count,
    ddips.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Example', N'U'), 1, 1, 'DETAILED') AS ddips
WHERE
    ddips.index_level = 0;
GO
DROP TABLE dbo.Example;
The script's output shows that the single-page table is split into two pages, and that the maximum physical row length has increased from 57 to 71 bytes (+14 bytes for the row-versioning information).
DBCC PAGE shows that the single updated row has Record Attributes = NULL_BITMAP VERSIONING_INFO; Record Size = 71, whereas all other rows in the table have Record Attributes = NULL_BITMAP; Record Size = 57.
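One possible way to reproduce the DBCC PAGE check is sketched below. Both DBCC IND and DBCC PAGE are undocumented commands, so treat this as an illustration under assumptions rather than a supported method. It has to run before the DROP TABLE statement at the end of the script, and the page number 12345 is a placeholder to be replaced with a PagePID value that DBCC IND returns for a data page (PageType = 1).

```sql
-- Send DBCC output to the client session instead of the error log
DBCC TRACEON (3604);

-- List the pages that belong to the clustered index (index_id = 1)
DBCC IND (N'Sandpit', N'dbo.Example', 1);

-- Dump one data page with per-row detail (print option 3).
-- File 1 and page 12345 are placeholders: use PageFID and PagePID from DBCC IND.
DBCC PAGE (N'Sandpit', 1, 12345, 3);
```

The Record Attributes and Record Size values appear in the header printed for each slot on the page.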
The same script, with the UPDATE replaced by the following single-row DELETE, produces a different result:
DELETE dbo.Example
WHERE ID = 1;
There is one fewer row in total (of course!), but the maximum physical row size has not increased. Row versioning information is only added to rows needed for the trigger pseudo-tables, and that row was ultimately deleted. The page split remains, however. This page-splitting activity is responsible for the slow performance observed when the trigger was present. If the definition of the Padding2 column is changed from varchar(8000) to varchar(7999), the page no longer splits.
See also this blog post by SQL Server MVP Dmitri Korotkevitch, which discusses the impact on fragmentation.
Your process is sound. Putting the index on the date column will make it much faster for SQL Server to find the rows that it is looking for. Without the nonclustered index, SQL Server will need to scan the production table every time you go to delete the rows. This means that SQL Server will need to load the entire table from disk each time the DELETE TOP 500 runs. Having (and using) the nonclustered index will be essential for getting this done quickly.
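A minimal sketch of that approach follows. All object names here (dbo.ProductionTable, LogDate, IX_ProductionTable_LogDate) and the 90-day cutoff are hypothetical stand-ins, since the real table and retention rule are not given:

```sql
-- Hypothetical names: dbo.ProductionTable, LogDate, IX_ProductionTable_LogDate
CREATE NONCLUSTERED INDEX IX_ProductionTable_LogDate
    ON dbo.ProductionTable (LogDate);
GO

DECLARE @Cutoff datetime = DATEADD(DAY, -90, GETDATE());  -- assumed retention period
DECLARE @Rows integer = 1;

-- Delete in small batches until no qualifying rows remain
WHILE @Rows > 0
BEGIN
    DELETE TOP (500)
    FROM dbo.ProductionTable
    WHERE LogDate < @Cutoff;

    SET @Rows = @@ROWCOUNT;
END;
```

Checking the execution plan of the DELETE is a reasonable way to confirm that the index is actually used to locate the rows.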
As for the memory setting, you'll want to set that to give SQL Server access to as much RAM as possible. You are correct that SQL Server will release RAM if other applications need it. The fact that you are seeing PAGEIOLATCH_EX and PAGEIOLATCH_SH waits tells me that you don't have enough RAM to keep the nonclustered index you created in memory alongside the other data the system is using. Increasing the memory settings on the SQL Server will help with this, provided that you have more memory in the server.
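For reference, the memory limit is usually changed through sp_configure. The sketch below assumes the setting in question is 'max server memory (MB)'; the value 8192 is purely a placeholder and should be chosen to suit the RAM actually installed and the needs of anything else running on the server.

```sql
-- 'max server memory (MB)' is an advanced option, so expose those first
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

-- Placeholder value: 8192 MB. Pick a figure appropriate for the server.
EXEC sys.sp_configure N'max server memory (MB)', 8192;
RECONFIGURE;
```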
Those two wait types (PAGEIOLATCH_EX and PAGEIOLATCH_SH) tell us that you are waiting for the disk to respond to your request for more data to be loaded. The more data you can keep in memory the less data you'll need to read from the disk.
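To see how much time has accumulated against these two wait types, sys.dm_os_wait_stats can be queried directly. A minimal sketch:

```sql
-- Accumulated waits for data pages being read from disk into memory
SELECT
    ws.wait_type,
    ws.waiting_tasks_count,
    ws.wait_time_ms
FROM sys.dm_os_wait_stats AS ws
WHERE ws.wait_type IN (N'PAGEIOLATCH_EX', N'PAGEIOLATCH_SH');
```

The figures are cumulative since the instance last started (or since the statistics were last cleared), so comparing two snapshots taken before and after a delete run gives a clearer picture than a single reading.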
Best Answer
Even if you rebuild the indexes, SQL Server doesn't reduce the size of the data file(s) just because you've removed some data inside the file. The assumption is typically that if the database got that big once, it will get that big again, so why shrink the file only to grow it again later? Both shrink and growth operations are extremely disruptive to normal activity. Please read this thread in full to understand why you want to be very careful about freeing up space on your drive (even on a cloud-hosted server).
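To see how much of each file is actually in use, as opposed to merely allocated on disk, something like the following sketch can be run in the database concerned:

```sql
-- size and SpaceUsed are both reported in 8 KB pages; 128 pages = 1 MB
SELECT
    df.name,
    df.type_desc,
    df.size / 128.0 AS allocated_mb,
    FILEPROPERTY(df.name, 'SpaceUsed') / 128.0 AS used_mb
FROM sys.database_files AS df;
```

A large gap between allocated_mb and used_mb after a purge is expected: the freed space stays inside the file and is reused by future inserts.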
Also see: