Sql-server – Delete rows older than x days without locking table

deletepartitioningsql servertable

we have a quite big MS SQL database with millions of rows.
I created a simple script to delete rows older than 1 month, but this seems to lock the table and creates trouble for the application.

The table has an indexed "ID" PK, and also a "date" column which I will use for this task.

What is the best way to do this without causing the lock?
I was considering partitioning, but not sure if its the best way to go.
Thanks in advance.

Best Answer

You need to delete rows in batches to avoid lock escalation. Lock escalation happens when SQL Server switches from row or page locks to locking the entire table. Lock escalation conserves memory when SQL Server detects a large number of row or page locks have been taken, and more are needed to complete the operation. This may be why you're having blocking issues. If I recall correctly, row locks escalate to table locks when more than 5,000 rows are involved.

If the ID column is also the clustered index you might benefit from using the ID column to perform the delete instead of using the date column. To do that, obtain the ID value for the date you want to work with, then loop to delete fewer than 5,000 rows per batch, where the ID value is less than or equal to the desired ID. This is only applicable assuming rows are inserted in date order, such that incrementing ID values match incrementing date values.

You can determine how many rows were affected by the DELETE by checking the @@rowcount variable, and you can set a maximum number of rows to be affected using SET ROWCOUNT, then loop until your DELETE no longer removes rows:

SET ROWCOUNT 4500;
DECLARE @i int;
SELECT @i = 1;
    
WHILE @i > 0
BEGIN
    Delete from dbo.my_table Where ID < 4356;
    SELECT @i = @@rowcount;
END

An alternate method that eliminates the use of the deprecated SET ROWCOUNT statement is to add a TOP clause to the DELETE query, as in:

DECLARE @i int = 1;
DECLARE @id int;
SET @id = COALESCE((
    SELECT TOP(1) mt.ID
    FROM dbo.my_table mt
    WHERE mt.date < '2021-01-04T00:00:00.000'
    ORDER BY mt.date DESC
    ), 0);
WHILE @i > 0
BEGIN
    DELETE TOP(5000) mt
    FROM dbo.my_table mt
    WHERE mt.ID <= @ID;
    SET @i = @@ROWCOUNT;
END

In the code above, I've added a query to obtain the maximum ID value for rows on or before January 4th, 2021, simply to illustrate how this works.

Both methods above result in the same outcome, all rows prior to the given @ID value are deleted from the table. Locking should be reduced to using row-level locks, thereby preventing a table lock from causing blocking on other concurrent processes.

As an alternative, you could look at Read Committed Snapshot Isolation (RCSI), otherwise known as row-versioning, at the database level. RCSI prevents writers from block readers, however there are many possible impacts to using row versioning. You should thoroughly test the applications that use the database if you decide to go this route. Read up carefully and consider the other impacts on your application before blindly enabling RCSI.