Sql-server – Large delete query seems to have frozen

deletesql serversql-server-2008-r2

We ran a delete query on a database with 1.8bn rows. This delete would delete 1.2bn rows.

In hindsight, we would have broken up this query into 100m at a time but we are in a position where it has been running for 24 hours and the log file is at 2Tb which appears to be the maximum size allowed for a log file.

The database is in SIMPLE recovery mode.

Is there any saving this query? Or do we need to just restart the SQL Server and see what happens? Will the database be unusable? Is there anything we can do to kill this off as cleanly as possible?

Best Answer

First of all, check the SQL errorlog to see if it actually hit a max size for the log. If it did, then the query has no hope of completing, it is probably already in a rollback state.

Even if it is, I always prefer to kill the spid manually (use sp_who2 or sp_WhoIsActive to find the spid, then do a kill 59 or whatever). You also can't check the rollback status unless you do an explicit KILL, see this related thread.

Since this is a delete, and not an update or insert, you may be very lucky and find that it rolls back immediately. If not, it may take as long (or longer) to roll back as it did to get to this point.

To see the rollback status, use

kill 59 with statusonly

Unfortunately, I have found this frequently does not show anything useful, just a "0% complete". In that case, you'll have to use sp_who2 and watch the IO and CPU to see if it is still doing something.

Regarding the reboot, this is a grave risk. If the spid is actively rolling back (CPU and IO are changing), then restarting SQL will only take the database offline entirely until the rollback is completely finished (hours and hours). But, if the CPU and IO are not moving, then it may in fact clear it right away. Either way, it is a risk.

One final option, if things are especially dire: If you have a backup from just before the delete started (and there haven't been other updates to the db), then the fastest way to recover may be to simply drop the DB, restart SQL, and restore from backup.

If you can't drop the DB (or if you already restarted the instance and the sql errorlog is predicting a 24-hour recovery time), then shut down SQL services, delete the MDF and LDF files from disk, start up SQL, drop the (ghost) database, and restore from backup.

Obviously you'd only attempt that if this were a back-end processing database that users didn't interact with.