SQL Server – avoid TABLOCKX on DELETE

Tags: delete, locking, sql-server, sql-server-2016

Goal:

Delete several billion old records (roughly 600GB) across three tables with zero downtime.

Methodology and Background:

I plan to delete batches of records associated with 1 million MyIDs at a time (i.e. MyID BETWEEN 1 AND 1000000). While the delete is running, two of the tables will be very hot, but activity on the third table can be safely suspended. In the two hot tables, MyID is the clustering key. In the third, cold, table I have a non-clustered index on MyID.

Besides the DELETE operation, the other activity in the hot tables will consist of INSERTs, probably several per second. MyID is an IDENTITY column, and the newly inserted records will not be in the range of any of the DELETE batches.

If it helps to gauge potential performance: a SELECT of an aggregate over a sample batch of these rows takes under 1 second on the tables clustered by MyID and about 2 seconds on the non-clustered table. I do not have a prod replica of this particular database to play in, so I cannot say how long the deletes will take; once I deploy this to DEV I'll have a better idea.

Side note: to address log size, I am wrapping these batches in transactions and increasing the transaction log backup frequency from every 15 minutes to every 5 minutes. I have 150GB of available log space.
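For illustration, a minimal sketch of the loop I have in mind, with placeholder names and bounds (dbo.MyHotTable and @MaxOldMyID are made up for the example):

-- Sketch of the planned batch loop; table name and bounds are placeholders.
DECLARE @BatchStart bigint = 1,
        @BatchSize  bigint = 1000000,
        @MaxOldMyID bigint = 2000000000;  -- last MyID eligible for deletion

WHILE @BatchStart <= @MaxOldMyID
BEGIN
    BEGIN TRANSACTION;

    DELETE FROM dbo.MyHotTable
    WHERE MyID BETWEEN @BatchStart AND @BatchStart + @BatchSize - 1;

    COMMIT TRANSACTION;

    SET @BatchStart += @BatchSize;
END;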

The Question

I read in the MS docs:

By default, a DELETE statement always acquires an exclusive (X) lock on the table it modifies, and holds that lock until the transaction completes.

I am very wary of using locking hints, but in this case can I safely use one to avoid TABLOCKX? Are there other ways besides a locking hint to avoid TABLOCKX?

Best Answer

Delete several billion old records (roughly 600GB)

In this case it may be more convenient to copy the rows that should remain rather than delete the old ones. Note that with a DELETE, all 600GB will go through your log file.
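A rough sketch of that approach, with placeholder names (dbo.MyTable, dbo.MyTable_New, the column list, and @LastOldMyID are all made up for the example); note that the swap step itself needs a brief pause in writes:

-- Sketch only; all names and values below are placeholders.
DECLARE @LastOldMyID bigint = 1000000000;  -- highest MyID to be purged

-- 1. Create an empty copy (columns only; indexes and constraints
--    must be scripted separately before the swap).
SELECT *
INTO dbo.MyTable_New
FROM dbo.MyTable
WHERE 1 = 0;

-- 2. Copy only the rows that should remain. Because MyID is an IDENTITY,
--    IDENTITY_INSERT and an explicit column list are required. With TABLOCK
--    this insert can be minimally logged under SIMPLE or BULK_LOGGED recovery.
SET IDENTITY_INSERT dbo.MyTable_New ON;

INSERT INTO dbo.MyTable_New WITH (TABLOCK) (MyID, Col1, Col2)
SELECT MyID, Col1, Col2
FROM dbo.MyTable
WHERE MyID > @LastOldMyID;

SET IDENTITY_INSERT dbo.MyTable_New OFF;

-- 3. Swap the tables during a quiet moment.
BEGIN TRANSACTION;
EXEC sp_rename 'dbo.MyTable', 'MyTable_Old';
EXEC sp_rename 'dbo.MyTable_New', 'MyTable';
COMMIT TRANSACTION;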

By default, a DELETE statement always acquires an exclusive (X) lock on the table it modifies, and holds that lock until the transaction completes.

This is wrong. SQL Server never starts with a table lock unless the corresponding hint is specified.

By default, SQL Server will acquire the finest-grain lock possible, in order to attain the greatest concurrency. In most cases, this means SQL Server will acquire row (RID or KEY) locks. SQL Server can acquire hundreds or thousands of individual locks on data in a single table without causing any problems. In some cases, however, if SQL Server determines that a query will access a range of rows within a clustered index, it may instead acquire page locks. After all, if every row on a page is going to be accessed, it's easier to manage a single page lock than dozens, or hundreds, of row locks. In other cases, primarily when there is no usable index to help process a query, SQL Server may lock an entire table right at the beginning of processing a query.

In your case, with filtering on PK ranges (MyID BETWEEN 1 AND 1000000), it will most probably start with page locks. Everything then depends on how many records fit on one page. If locking 1,000,000 records at page granularity takes fewer than 5,000 locks, you'll be fine with page locks; if a statement acquires more locks than that, lock escalation will happen and your table will be completely locked.

...

In addition to escalating locks when an instance-wide threshold is crossed, SQL Server will also escalate locks when any individual session acquires more than 5,000 locks in a single statement. In this case, there is no randomness in choosing which session will get its locks escalated; it is the session that acquired the locks.

From SQL Server Concurrency: Locking, Blocking and Row Versioning by Kalen Delaney
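While testing a batch, you can watch how many locks the session actually holds with a query like the one below (the session_id value 53 is a placeholder for the session running the test DELETE):

-- Count locks held by the delete session, grouped by type and mode;
-- replace 53 with the real session_id.
SELECT resource_type,
       request_mode,
       COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = 53
GROUP BY resource_type, request_mode
ORDER BY lock_count DESC;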

So only by testing and monitoring the locks acquired can you find the optimal batch size and avoid lock escalation. You can approximately calculate your batch size if you know the average number of rows per page in your table. Alternatively, you can force page locking with the PAGLOCK hint, and even disable lock escalation on the table:

ALTER TABLE MyTable SET ( LOCK_ESCALATION = DISABLE )
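For the batch size calculation, a rough average of rows per page can be read from sys.dm_db_partition_stats; a sketch, assuming the clustered table is named dbo.MyHotTable:

-- Approximate rows per page for the heap or clustered index (index_id 0 or 1).
SELECT SUM(ps.row_count) AS total_rows,
       SUM(ps.in_row_data_page_count) AS data_pages,
       SUM(ps.row_count) * 1.0
           / NULLIF(SUM(ps.in_row_data_page_count), 0) AS avg_rows_per_page
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.MyHotTable')
  AND ps.index_id IN (0, 1);

Multiplying avg_rows_per_page by a number safely under 5,000 gives a batch size that should stay at page granularity, and a hint such as DELETE FROM dbo.MyHotTable WITH (PAGLOCK) WHERE ... keeps the starting granularity predictable.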