Given a SQL Server table with
- a large number of rows
- no columns with large-value data types
- multiple indexes
- more allocated space than available for the largest possible transaction log size
- a single-column primary key with clustered index (optional consideration for this question)
- an average record size of 1k (optional consideration for this question)
and an update statement which
- needs to be run against every row
- sets a value on a non-indexed column (optional consideration for this question)
What techniques can be employed to reduce the peak disk space consumption (including data files, log file, and tempdb, if applicable) required to perform this update?
For purposes of this question, the following is allowed:
- applying changes in batches
- run in single-user mode
- change recovery model
Best Answer
I went through a similar process just a couple of weeks ago. After several attempts on a couple of the bigger tables (one of them with more than 100 million rows, nearly 80 GB), I settled on these steps to speed things up and keep the transaction log small:
ALTER DATABASE [my_db] SET RECOVERY SIMPLE WITH NO_WAIT
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
drop nonclustered indexes: this avoids the overhead of updating every index on each write (see Brent Ozar's blog for more details)
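Before dropping anything, it helps to capture which nonclustered indexes exist so they can be recreated later. A minimal sketch, assuming a hypothetical target table `dbo.MyTable` and index name `IX_MyTable_SomeColumn` (substitute your own):

```sql
-- List the nonclustered indexes on the target table so their
-- definitions can be scripted out before dropping them.
SELECT i.name
FROM   sys.indexes AS i
WHERE  i.object_id = OBJECT_ID(N'dbo.MyTable')
  AND  i.type_desc = N'NONCLUSTERED';

-- Drop one of them (repeat per index).
DROP INDEX IX_MyTable_SomeColumn ON dbo.MyTable;
```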
use batches: as suggested by mrdenny, perform your updates, deletes, and inserts in smaller chunks.
Here is a sample of batch processing for an update, 1000 rows at a time:
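A minimal sketch of such a batch loop, assuming a hypothetical table `dbo.MyTable` and target column `SomeColumn` (adjust the names and the WHERE predicate so each pass only touches rows not yet updated):

```sql
DECLARE @BatchSize INT = 1000;
DECLARE @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    -- Update at most @BatchSize rows per iteration, skipping
    -- rows that already carry the new value.
    UPDATE TOP (@BatchSize) dbo.MyTable
    SET    SomeColumn = 'new value'
    WHERE  SomeColumn <> 'new value'
       OR  SomeColumn IS NULL;

    SET @Rows = @@ROWCOUNT;  -- 0 when no rows are left to update

    -- Under SIMPLE recovery, a CHECKPOINT lets the log space used
    -- by the previous batch be reused, keeping the peak log size
    -- close to one batch's worth of changes.
    CHECKPOINT;
END;
```

Because each small batch commits on its own, the transaction log only needs to hold one batch at a time instead of the entire multi-gigabyte update.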
recreate the nonclustered indexes you dropped earlier
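Recreating an index is a matter of re-running its original definition; a sketch using the same hypothetical names as above:

```sql
-- Recreate a previously dropped nonclustered index.
CREATE NONCLUSTERED INDEX IX_MyTable_SomeColumn
    ON dbo.MyTable (SomeColumn);
```

Note that index rebuilds are themselves logged, so it may be worth recreating them one at a time while still in SIMPLE recovery, before switching back to FULL.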
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
ALTER DATABASE [my_db] SET RECOVERY FULL;