Sql-server – Index creation leads to large transaction log growth

indexsql serversql server 2014transaction-log

I have an approximately 1 billion row table. I then create an index as follows:

CREATE NONCLUSTERED INDEX [IX_Index1] 
ON [dbo].[MyTable]( [CustomerID] ASC )  -- CustomerID is uniqueidentifier
WITH (DATA_COMPRESSION=PAGE);

This results in a 9.15GB index (compressed), however the transaction log went from empty to 158GB while the index was being created (no other DB activity). Why is the amount of log space used that much higher than the resulting size of the index?

The database is SQLServer 2014 Enterprise in full recovery mode in an availability group.

Update
I dropped and recreated the index with the option SORT_IN_TEMPDB = ON. The resulting transaction log writes were 10.4GB for the DB's tlog and 1.27GB for the tempdb tlog.

Best Answer

Referring to MS documentation here. Also Creating a 60GB Index on SQLFool.

What I believe is happening is that when the index is being built with SORT_IN_TEMDB=OFF, the entire index build processed is logged via the DB's transaction log (DB is set to full recovery). (In addition, I expect there to be a high number of page splits due to the scattergun-nature of the GUID column that is being indexed)

When the index is built using SORT_IN_TEMPDB=ON, then most of the transaction log activity is going to be against tempdb which is in SIMPLE recovery mode.