SQL Server – Why does the SQL Server database triple in size after deleting millions of records

sql-server-2005, sql-server-2008

We have two database tables, ErrorLog and Audit, which have both grown fairly large over the last couple of years. So, in an effort to reduce the size of the database, I have to write a script to clean out all rows older than six months from these two tables.

This is what I came up with:

ALTER DATABASE StudioWebTest SET RECOVERY SIMPLE;

DELETE FROM [Audit] WHERE AuditDate < DATEADD(M, -6, GETDATE());
DELETE FROM [ErrorLog] WHERE ErrorDate < DATEADD(M, -6, GETDATE());

ALTER DATABASE StudioWebTest SET RECOVERY FULL;

On my development machine (SQL Server 2008 R2), with or without changing the recovery model, the database grows to 2-3 times its original size. However, if I follow this up immediately with a shrink command, it cuts the database back to half its original size (prior to the record deletion).

However, if I hold off on doing the shrink for a few days, the shrink is not nearly as effective: while the database size is reduced, it is still almost twice the size it was before deleting the records. Not knowing too much about how SQL Server uses the space it has allocated, I assume this has something to do with it reusing the extra space that has been freed.
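For reference, the shrink I am running is along these lines (DBCC SHRINKDATABASE is the standard command; the 10 percent free-space target here is just an illustrative value):

-- Shrink all files in the database, aiming to leave 10% free space (illustrative).
DBCC SHRINKDATABASE (StudioWebTest, 10);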

This would all be okay except for one thing: when we run this on our production test environment, which uses SQL Server 2005, the shrink command does not reduce the size of the database back to half its original size.

As an alternative I have also tried using TRUNCATE in place of DELETE, but this doesn't appear to make much difference: the database still grows massively after the statements complete, and I still have to shrink it for the same result. We haven't tried this on the production test machine yet, but as the shrink command doesn't seem to be cutting it there, it seems doubtful that this will yield any improvement.
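For completeness, the TRUNCATE variant looks like this. Note that TRUNCATE takes no WHERE clause, so unlike the dated DELETE it removes every row, not just those older than six months:

-- Removes ALL rows by deallocating the table's pages; minimally logged,
-- and cannot be filtered by date.
TRUNCATE TABLE [Audit];
TRUNCATE TABLE [ErrorLog];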

Anyway, I was just wondering if someone could explain why the database grows by so much despite the change to the recovery model? How can this be prevented? Or could you suggest an alternative means of reducing the size of the database, if this is not the best way to go about it?

UPDATE:

I was just doing some more tests using TRUNCATE, and it doesn't seem to be increasing the size now (maybe I imagined it). I do still need to shrink the database to see a reduction in overall size, though. I might try this on the production test server. At the end of the day, I think my managers will be happy as long as the size goes down to some degree.

Best Answer

You are doing two giant deletes, each one running as a single autocommit transaction. Each record being deleted is first written to the transaction log.

Because you have switched the database to the SIMPLE recovery model, the transaction log will be truncated at each checkpoint, but you are still causing it to grow during each delete statement.
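If you want to confirm that the growth is in the log file rather than the data file, you can check with standard commands (the database name is taken from your question):

-- Log size and percent used for every database on the instance.
DBCC SQLPERF (LOGSPACE);

-- Per-file sizes for the database in question (size is in 8 KB pages).
USE StudioWebTest;
SELECT name, type_desc, size * 8 / 1024 AS size_mb
FROM sys.database_files;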

I suggest you break the deletes up into chunks. Inside a loop, you can delete one day at a time, starting with the oldest data and stopping at the six-month mark. (Make sure you have indexes that support your delete criteria; create them if you have to, and drop them afterwards.)

This way, checkpoints will occur more frequently, keeping the size of the transaction log down to a minimum.
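Here is a minimal sketch of that loop for the Audit table (table and column names are from your question; the day-at-a-time step size and the explicit CHECKPOINT are illustrative choices, and the same pattern applies to ErrorLog):

DECLARE @cutoff DATETIME;
DECLARE @day DATETIME;

SET @cutoff = DATEADD(M, -6, GETDATE());
-- Start at midnight of the oldest day present in the table.
SELECT @day = DATEADD(DAY, DATEDIFF(DAY, 0, MIN(AuditDate)), 0) FROM [Audit];

WHILE @day < @cutoff
BEGIN
    -- Each day's delete is its own autocommit transaction, so the log
    -- only has to hold one day's worth of rows at a time. The extra
    -- @cutoff predicate keeps the final, partial day safe.
    DELETE FROM [Audit]
    WHERE AuditDate >= @day
      AND AuditDate < DATEADD(DAY, 1, @day)
      AND AuditDate < @cutoff;

    -- In SIMPLE recovery, a checkpoint lets the log space be reused.
    CHECKPOINT;

    SET @day = DATEADD(DAY, 1, @day);
END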

Truncating a table will NOT grow the database.

Also, before you issue the delete statements, make sure you find out whether any triggers exist on those tables.
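A quick way to check (sys.triggers is available from SQL Server 2005 onwards; the dbo schema is assumed here):

-- List any triggers defined on the two tables being purged.
SELECT t.name AS trigger_name, t.is_disabled, OBJECT_NAME(t.parent_id) AS table_name
FROM sys.triggers AS t
WHERE t.parent_id IN (OBJECT_ID('dbo.Audit'), OBJECT_ID('dbo.ErrorLog'));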