What are the most effective techniques to reduce SQL Server disk space use when updating all rows in a very large table?

sql-server, tempdb, transaction-log

Given a SQL Server table with

  • a large number of rows
  • no columns with large-value data types
  • multiple indexes
  • more allocated space than is available for the transaction log, even at its largest possible size
  • a single-column primary key with clustered index (optional consideration for this question)
  • an average record size of 1k (optional consideration for this question)

and an update statement which

  • needs to be run against every row
  • sets a value on a non-indexed column (optional consideration for this question)

What techniques can be employed to reduce the peak disk space consumption (including data files, log file and tempdb, if applicable) required to do this update?

For the purposes of this question, the following techniques are allowed:

  • applying changes in batches
  • running in single-user mode
  • changing the recovery model (the corresponding commands are sketched below)

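(For reference, the last two items are each a single ALTER DATABASE statement; a minimal sketch, assuming the database is called my_db as in the answer below:)

 -- Disconnect other sessions and take the database single-user.
 ALTER DATABASE [my_db] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

 -- Use a minimally logged recovery model so committed log space can be
 -- reused at each checkpoint (BULK_LOGGED is an alternative that still
 -- allows log backups, at the cost of point-in-time restore).
 ALTER DATABASE [my_db] SET RECOVERY SIMPLE;

 -- ... run the batched update here ...

 -- Allow connections again; the recovery model is switched back to FULL
 -- as part of the clean-up described in the answer below.
 ALTER DATABASE [my_db] SET MULTI_USER;
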
Best Answer

I went through a similar process just a couple of weeks ago. After several attempts on a couple of the bigger tables (one of them over 100 million rows, close to 80 GB), I came up with these steps to speed things up and keep the transaction log small: switch the database to a minimally logged recovery model (SIMPLE or BULK_LOGGED), drop the nonclustered indexes on the table, relax the session isolation level, and apply the update in batches. The recovery model, indexes and isolation level are all restored once the update has finished (see the bullets after the script).

Here is a sample of batch processing for an update, 1000 rows at a time:

 -- The WHERE clause must exclude rows that have already been updated
 -- (e.g. col1 <> new_value), otherwise the loop never finishes.
 UPDATE TOP(1000) your_table
 SET    col1 = new_value
 WHERE  <your_condition>;

 WHILE @@ROWCOUNT > 0
 BEGIN
     -- Each batch commits on its own, so the log space it used can be
     -- reused (after a checkpoint under SIMPLE recovery, or after a log
     -- backup under BULK_LOGGED/FULL) instead of accumulating.
     UPDATE TOP(1000) your_table
     SET    col1 = new_value
     WHERE  <your_condition>;
 END;
 GO
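
If there is no natural condition that filters out already-updated rows, the batches can instead walk the clustered primary key in ranges. A minimal sketch, assuming the key is a single integer column called id here (a placeholder, like new_value above):

 DECLARE @batch_size int = 1000;
 DECLARE @last_id    int = 0;
 DECLARE @max_id     int;

 -- id stands in for the single-column clustered primary key.
 SELECT @max_id = MAX(id) FROM your_table;

 WHILE @last_id < @max_id
 BEGIN
     -- Update one key range per iteration; each range commits separately.
     UPDATE your_table
     SET    col1 = new_value
     WHERE  id >  @last_id
       AND  id <= @last_id + @batch_size;

     SET @last_id = @last_id + @batch_size;
 END;
 GO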

Once the update has finished, put everything back the way it was:

  • restore the nonclustered indexes

  • SET TRANSACTION ISOLATION LEVEL READ COMMITTED

  • ALTER DATABASE [my_db] SET RECOVERY FULL;
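
For completeness, the wrap-up might look like the sketch below; the index name and key column are placeholders rather than anything from the original answer:

 -- Recreate the nonclustered index(es) dropped before the update
 -- (IX_your_table_col2 and col2 are placeholders).
 CREATE NONCLUSTERED INDEX IX_your_table_col2
     ON your_table (col2);

 -- Back to the default isolation level for the session.
 SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

 -- Back to full recovery; take a full or differential backup afterwards
 -- so the transaction log backup chain can be restarted.
 ALTER DATABASE [my_db] SET RECOVERY FULL;
 GO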