Given a SQL Server table with
- a large number of rows
- no columns with large-value data types
- multiple indexes
- more allocated space than available for the largest possible transaction log size
- a single-column primary key with clustered index (optional consideration for this question)
- an average record size of 1k (optional consideration for this question)
and an update statement which
- needs to be run against every row
- sets a value on a non-indexed column (optional consideration for this question)
What techniques can be employed to reduce the peak disk space consumption (including data files, log file, and tempdb, if applicable) required to perform this update?
For purposes of this question, the following is allowed:
- applying changes in batches
- run in single-user mode
- change recovery model
Best Answer
I went through a similar process just a couple of weeks ago. After several attempts on a couple of the bigger tables (one of them with more than 100 million rows, nearly 80 GB), I settled on these steps to speed things up and keep the transaction log small:
ALTER DATABASE [my_db] SET RECOVERY SIMPLE WITH NO_WAIT
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
drop nonclustered indexes: this avoids the overhead of updating every index on each write (see Brent Ozar's blog for more details)
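Before dropping anything, it helps to capture which nonclustered indexes exist so they can be recreated later. A minimal sketch, assuming a hypothetical target table `dbo.MyTable` and index name `IX_MyTable_SomeColumn` (substitute your own):

```sql
-- List the nonclustered indexes on the target table so their
-- definitions can be scripted out before dropping them.
SELECT i.name
FROM   sys.indexes AS i
WHERE  i.object_id = OBJECT_ID(N'dbo.MyTable')
  AND  i.type_desc = N'NONCLUSTERED';

-- Drop one of them (repeat per index).
DROP INDEX IX_MyTable_SomeColumn ON dbo.MyTable;
```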
use batches: as suggested by mrdenny, perform your updates, deletes, and inserts in smaller chunks.
Here is a sample of batch processing for an update, 1000 rows at a time:
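A minimal sketch of such a batch loop, assuming a hypothetical table `dbo.MyTable` and target column `SomeColumn` (adjust the names and the WHERE predicate so each pass only touches rows not yet updated):

```sql
DECLARE @BatchSize INT = 1000;
DECLARE @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    -- Update at most @BatchSize rows per iteration, skipping
    -- rows that already carry the new value.
    UPDATE TOP (@BatchSize) dbo.MyTable
    SET    SomeColumn = 'new value'
    WHERE  SomeColumn <> 'new value'
       OR  SomeColumn IS NULL;

    SET @Rows = @@ROWCOUNT;  -- 0 when no rows are left to update

    -- Under SIMPLE recovery, a CHECKPOINT lets the log space used
    -- by the previous batch be reused, keeping the peak log size
    -- close to one batch's worth of changes.
    CHECKPOINT;
END;
```

Because each small batch commits on its own, the transaction log only needs to hold one batch at a time instead of the entire multi-gigabyte update.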
recreate the nonclustered indexes you dropped earlier
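Recreating an index is a matter of re-running its original definition; a sketch using the same hypothetical names as above:

```sql
-- Recreate a previously dropped nonclustered index.
CREATE NONCLUSTERED INDEX IX_MyTable_SomeColumn
    ON dbo.MyTable (SomeColumn);
```

Note that index rebuilds are themselves logged, so it may be worth recreating them one at a time while still in SIMPLE recovery, before switching back to FULL.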
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
ALTER DATABASE [my_db] SET RECOVERY FULL;