we have a quite big MS SQL database with millions of rows.
I created a simple script to delete rows older than 1 month, but this seems to lock the table and creates trouble for the application.
The table has an indexed "ID" PK, and also a "date" column which I will use for this task.
What is the best way to do this without causing the lock?
I was considering partitioning, but not sure if its the best way to go.
Thanks in advance.
Best Answer
You need to delete rows in batches to avoid lock escalation. Lock escalation happens when SQL Server switches from row or page locks to locking the entire table. Lock escalation conserves memory when SQL Server detects a large number of row or page locks have been taken, and more are needed to complete the operation. This may be why you're having blocking issues. If I recall correctly, row locks escalate to table locks when more than 5,000 rows are involved.
If the ID column is also the clustered index you might benefit from using the ID column to perform the delete instead of using the date column. To do that, obtain the ID value for the date you want to work with, then loop to delete fewer than 5,000 rows per batch, where the ID value is less than or equal to the desired ID. This is only applicable assuming rows are inserted in date order, such that incrementing ID values match incrementing date values.
You can determine how many rows were affected by the
DELETE
by checking the@@rowcount
variable, and you can set a maximum number of rows to be affected usingSET ROWCOUNT
, then loop until yourDELETE
no longer removes rows:An alternate method that eliminates the use of the deprecated
SET ROWCOUNT
statement is to add aTOP
clause to theDELETE
query, as in:In the code above, I've added a query to obtain the maximum ID value for rows on or before January 4th, 2021, simply to illustrate how this works.
Both methods above result in the same outcome, all rows prior to the given
@ID
value are deleted from the table. Locking should be reduced to using row-level locks, thereby preventing a table lock from causing blocking on other concurrent processes.As an alternative, you could look at Read Committed Snapshot Isolation (RCSI), otherwise known as row-versioning, at the database level. RCSI prevents writers from block readers, however there are many possible impacts to using row versioning. You should thoroughly test the applications that use the database if you decide to go this route. Read up carefully and consider the other impacts on your application before blindly enabling RCSI.