We are attempting to update/delete a large number of records in a multi-billion-row table. Since this is a popular table, there is a lot of activity in different sections of it. Any large update/delete is blocked for extended periods of time (while it waits to acquire row, page, or table locks), resulting in timeouts or in the task taking multiple days to complete.
So, we are changing the approach to delete a small batch of rows at a time. But we want to check whether the selected rows (say 100, 1,000, or 2,000 rows) are currently locked by a different process or not.
- If not, then proceed with delete/update.
- If they are locked, then move on to the next group of records.
- At the end, come back to the beginning and attempt to update/delete the ones that were skipped.
Is this doable?
Thanks,
ToC
Best Answer
If I understand the request correctly, the goal is to delete batches of rows, while at the same time, DML operations are occurring on rows throughout the table. The goal is to delete a batch; however, if any underlying rows contained within the range defined by said batch are locked, then we must skip that batch and move to the next batch. We must then return to any batches that were not previously deleted and retry our original delete logic. We must repeat this cycle until all required batches of rows are deleted.
As has been mentioned, it is reasonable to use a READPAST hint with the READ COMMITTED (default) isolation level, in order to skip past rows that are currently locked. I will go a step further and recommend using the SERIALIZABLE isolation level and nibbling deletes.
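For reference, a READPAST-based batch delete might look like the following sketch; the table name, alias, and predicate here are hypothetical, not part of the original question:

```sql
-- Delete a small batch, skipping any rows another session currently has locked.
-- dbo.BigTable and the SomeFilter predicate are assumptions for illustration.
DELETE TOP (1000) bt
FROM dbo.BigTable AS bt WITH (READPAST, ROWLOCK)
WHERE bt.SomeFilter = 1;
```

Rows skipped by READPAST remain in the table, which is why the question's plan includes a final pass to return for the leftovers.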
SQL Server uses key-range locks to protect a range of rows implicitly included in a record set being read by a Transact-SQL statement while using the serializable transaction isolation level. You can find more here: https://technet.microsoft.com/en-US/library/ms191272(v=SQL.105).aspx
With nibbling deletes, our goal is to isolate a range of rows and ensure that no changes will occur to those rows while we are deleting them, that is to say, we do not want phantom reads or insertions. The serializable isolation level is meant to solve this problem.
Before I demonstrate my solution, I would like to add that I am neither recommending switching your database's default isolation level to SERIALIZABLE nor claiming that my solution is the best. I merely wish to present it and see where we can go from here.
A few house-keeping notes:
To begin my experiment, I will set up a test database, a sample table, and I will fill the table with 2,000,000 rows.
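The original setup code is not shown here, but a minimal sketch might look like the following; the database name, table name, and columns are all assumptions:

```sql
-- Sketch of a possible test setup; all names are hypothetical.
CREATE DATABASE NibbleTest;
GO
USE NibbleTest;
GO
CREATE TABLE dbo.BigTable
(
    Id          INT IDENTITY(1,1) NOT NULL,
    SomeValue   UNIQUEIDENTIFIER  NOT NULL DEFAULT NEWID(),
    CreatedDate DATETIME2         NOT NULL DEFAULT SYSDATETIME()
);
GO
-- Load 2,000,000 rows from a cross join of system catalog views.
INSERT INTO dbo.BigTable (SomeValue)
SELECT NEWID()
FROM (SELECT TOP (2000000) 1 AS n
      FROM sys.all_objects AS a
      CROSS JOIN sys.all_objects AS b
      CROSS JOIN sys.all_objects AS c) AS nums;
```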
At this point, we will need one or more indexes upon which the locking mechanisms of the SERIALIZABLE isolation level can act.
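A clustered index on the key column gives the key-range locks something to act on; the constraint name here is an assumption:

```sql
-- Key-range locks are taken on index keys, so the delete range
-- should be covered by an index.
ALTER TABLE dbo.BigTable
    ADD CONSTRAINT PK_BigTable_Id PRIMARY KEY CLUSTERED (Id);
```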
Now, let us check to see that our 2,000,000 rows were created
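A simple count confirms the load (assuming the hypothetical table name above):

```sql
SELECT COUNT(*) AS RowCnt
FROM dbo.BigTable;  -- we expect 2,000,000
```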
So, we have our database, table, indexes, and rows. So, let us set up the experiment for nibbling deletes. First, we must decide how best to create a typical nibbling delete mechanism.
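A typical nibbling delete can be sketched as follows; the batch size, table name, and delete predicate are assumptions for illustration, not the original author's exact code:

```sql
-- Nibbling delete under SERIALIZABLE: each small batch takes
-- key-range locks on the rows it touches, preventing phantoms.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

DECLARE @BatchSize   INT = 1000,
        @RowsDeleted INT = 1;

WHILE @RowsDeleted > 0
BEGIN
    BEGIN TRANSACTION;

    DELETE TOP (@BatchSize)
    FROM dbo.BigTable
    WHERE Id <= 100000;          -- the range we want to remove

    SET @RowsDeleted = @@ROWCOUNT;

    COMMIT TRANSACTION;          -- release locks after every nibble
END;
```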
As you can see, I placed the explicit transaction inside the while loop. If you would like to limit log flushes, then feel free to place it outside the loop. Furthermore, since we are in the FULL recovery model, you may wish to take transaction log backups more frequently while running your nibbling delete operations, to keep your transaction log from growing out of control.
So, I have a couple of goals with this setup. First, I want my key-range locks, so I try to keep the batches as small as possible. I also do not want to negatively impact concurrency on my "gigantic" table, so I want to take my locks and release them as fast as I can. Therefore, I recommend that you keep your batch sizes small.
Now, I want to provide a very short example of this deletion routine in action. We must open a new window within SSMS and delete one row from our table. I will do this within an implicit transaction using the default READ COMMITTED isolation level.
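The single-row delete might look like this (the specific Id value is an assumption):

```sql
-- New SSMS window, default READ COMMITTED, autocommit transaction.
DELETE FROM dbo.BigTable
WHERE Id = 1000;
```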
Was this row actually deleted?
Yes, it was deleted.
Now, in order to see our locks, let us open a new window within SSMS and add a code snippet or two. I am using Adam Machanic's sp_whoisactive, which can be found here: sp_whoisactive
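For example, either of the following will surface the locks; the sys.dm_tran_locks query is my own addition for readers without sp_whoisactive installed:

```sql
-- sp_whoisactive with lock detail per session:
EXEC sp_whoisactive @get_locks = 1;

-- Or query the lock DMV directly:
SELECT request_session_id, resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID();
```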
Now, we are ready to begin. In a new SSMS window, let us begin an explicit transaction that will attempt to re-insert the one row that we deleted. At the same time, we will fire off our nibbling delete operation.
The insert code:
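A sketch of that insert, reusing the hypothetical table and Id value from earlier; note the transaction is deliberately left open so its exclusive lock persists:

```sql
BEGIN TRANSACTION;

SET IDENTITY_INSERT dbo.BigTable ON;

-- Re-insert the row we deleted, keeping its original key.
INSERT INTO dbo.BigTable (Id, SomeValue, CreatedDate)
VALUES (1000, NEWID(), SYSDATETIME());

SET IDENTITY_INSERT dbo.BigTable OFF;

-- No COMMIT yet: the open transaction holds an exclusive lock on the row.
```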
Let us kick off both operations beginning with the insert and followed by our deletes. We can see the key-range locks and exclusive locks.
The insert generated these locks:
The nibbling delete/select is holding these locks:
Our insert is blocking our delete as expected:
Now, let us commit the insert transaction and see what is up.
And as expected, all transactions complete. Now, we must check to see whether the insert was a phantom or whether the delete operation removed it as well.
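A quick check against the hypothetical key confirms the outcome:

```sql
-- If the nibbling delete's key-range lock did its job, the
-- re-inserted row was swept up too and this returns no rows.
SELECT *
FROM dbo.BigTable
WHERE Id = 1000;
```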
In fact, the insert was deleted; so, no phantom insert was allowed.
So, in conclusion, I think the true intention of this exercise is not to try to track every single row, page, or table-level lock and determine whether an element of a batch is locked and would therefore require our delete operation to wait. That may have been the intent of the questioner; however, that task is herculean and impractical, if not impossible. The real goal is to ensure that no unwanted phenomena arise once we have isolated the range of our batch with locks of our own and then proceed to delete the batch. The SERIALIZABLE isolation level achieves this objective. The key is to keep your nibbles small, keep your transaction log under control, and eliminate unwanted phenomena.
If you want speed, then don't build gigantic tables that cannot be partitioned, leaving you unable to use partition switching for the fastest results. The key to speed is partitioning and parallelism; the key to suffering is nibbles and live-locking.
Please let me know what you think.
I created some further examples of the SERIALIZABLE isolation level in action. They should be available at the links below.
Delete Operation
Insert Operation
Equality Operations - Key-Range Locks on Next Key Values
Equality Operations - Singleton Fetch of Existent Data
Equality Operations - Singleton Fetch of Nonexistent Data
Inequality Operations - Key-Range Locks on Range and Next Key Values