Problem: I have a massive table – occupies a whopping 9.5 TBs. I need to delete around 5 mil rows from it. The best method I have come up with deletes 1 million records in 2 hours! Since I can not run this during peak time, I get only 2 hour window everyday and deleting 5 mil rows would take me 4 working days at this rate.
Question: I was wondering if SQL Server has a feature where I could run 2 or more deletes on a table simultaneously. Possibly something like virtually partitioning the table – not sure.
I also want to add that running my script induces page locks but keeps the transaction log from growing. My script is very similar to the one suggested by vonPryz:
How to Delete From Large Tables in Batches Without Growing the Transaction Log by Greg M Lucas.
Best Answer
Let me try to summarise the answers as per the commenters above.
First of all, to achieve higher deletion speeds you can do three things:
Ad 1) Since deleting a large number of rows at a time hit the transaction log hard and causes locking - you likely want the number of deletes in a statement to be "small". I find that around 10K rows is a good number (and having table lock escalation turned off with
ALTER TABLE Foo SET LOCK_ESCALATION = Disabled
Ad 2) If your table is not already partitioned, this is not an option
Ad 3) DELETE in SQL Server is single threaded, so you need to run more than one at a time to get max speed. To run multiple deletes concurrently, you need a way to partition them so each parallel delete runs on its own set of keys and doesn't block with the others. Typically, you can use the primary key of the table to partition the delete.
For example, if you have an IDENTITY column called
key
on the table, you can first:Keep the Max - Min and Min somewhere (a table in tempdb) that you can quickly read from into variables
@IntervalSize
,@MinKey
respectively.Let us say you decide that you can allocate 4 cores for running deletion. You now run 4 queries, either from 4 x SSMS new query or four command prompt via SQLCMD, each doing this:
Pick @n = 0,1,2,3 for each of the concurrently running DELETES.