Sql-server – Securely deleting data from SQL Server logs

sql serversql-server-2008-r2transaction-log

Is there a good way to purge data from a particular DB's *.LDF?

A little background… I'm a SysAd, not a DBA. I'm running SQL Server 2008 R2 for one of my applications and I'm not particularly concerned about the size of the logs. However, in order to meet certain DSS compliance, I occasionally have to purge certain data from the database and ensure that it is in no way recoverable.

I'm under the impression that if I go in and DELETE the particular entries I should be fine as far as the *.MDF is concerned. But I still need to make sure there is no way to recover the data from logs.

Best Answer

The SQL Server log contains all the data that was written to the database (in some form). So, you are right to be concerned about it.

The log is a ring buffer made up of virtual log files (VLF). Once a VLF is not needed anymore, it can be reused. However it will not be reused until the "write head" comes around to that VLF again.

The only difference between simple and full (with a few small exceptions) recovery model is how quickly a VLF is "not needed anymore". If you use full recovery mode, a VLF is "needed" at least until a log backup has been taken. (A full backup never clears the "needed" flag of a VLF.)

The only way to force a VLF to be overwritten is to first make sure that it is not needed anymore and then pump enough transactions into the database that the VLF gets reused. That however is probably not acceptable to you.

Another way to force the data out is to shrink the log file. If you can take downtime, follow this process:

  1. Set the database to READ_ONLY
  2. Run the CHECKPOINT command.
  3. take a log backup (if in full recovery)
  4. make sure other log-readers (replication, CDC, ...) have caught up
  5. set the database to READ_WRITE
  6. create a new log file
  7. empty the original file migrating the data to the new (DBCC SHRINKFILE, there should not be any data left to migrate at this point, but you still need to execute this step.)
  8. drop the old log file
  9. take another log backup (if in full recovery)

You still need to test this, but this should get you to where you need to be.

If you can't take downtime, follow the same process without setting the database to READ_ONLY. However, you run the risk not to be able to have a complete clean cutover, but if you do this regularly you still should be OK.


Note, as some of the commenters have mentioned, a delete does not actually remove the data from the MDF files either. However, that can be dealt with in a similar fashion:

  1. rebuild all indexes on the offending table(s) (or better all tables)
  2. create a new file
  3. empty the old file, migrating the data to the new file using DBCC SHRINKFILE
  4. drop the old file

Step 1 is necessary as the file migration does not change page contents, so single deleted rows might survive the move. Rebuilding the indexes should use new pages.


Additional notes:

The log cleanup has to happen after the data file cleanup as there is a chance that the data file cleanup moves some of the data into the log file.

Index rebuild operations and also a lot of data access operations utilize tempdb, so you have to shred that too. Tempdb is recreated on server restart, so you can stop the SQL Server service, delete the tempdb files and then start the service again.

Finally, this all causes a lot of blocks (~ database pages) to be unlinked from existing database files. This unlinking (due to file shrinkage or the deletion of files) does not force those blocks to be overwritten. So it is recommended to create the new files in the above processes on a separate disk and afterwards securely wipe the original disk.