Sql-server – Disk space full during insert, what happens

insertrollbackselect-intosql serversql-server-2016

Today I discovered the harddrive which stores my databases was full. This has happened before, usually the cause is quite evident. Usually there is a bad query, which causes huge spills to tempdb which grows till the disk is full. This time it was a bit less evident what happened, as tempdb wasn't the cause of the full drive, it was the database itself.

The facts:

  • Usual database size is about 55 GB, it grew to 605 GB.
  • Log file has normal size, datafile is huge.
  • Datafile has 85% available space (I interpret this as 'air': space that was used, but has been freed. SQL Server reserves all space once allocated).
  • Tempdb size is normal.

I have found the likely cause; there is one query which selects much too many rows (bad join causes selection of 11 billion rows where a couple of hundred thousand is expected). This is a SELECT INTO query, which made me wonder whether the following scenario could have happened:

  • SELECT INTO is executed
  • Target table is created
  • Data is inserted as it is selected
  • Disk fills up, causing the insert to fail
  • SELECT INTO is aborted and rolled back
  • Rollback frees up space (data already inserted is removed), but SQL Server doesn't release the freed up space.

In this situation, however, I wouldn't have expected the table created by the SELECT INTO to still exist, it should be dropped by the rollback. I tested this:

BEGIN TRANSACTION 
SELECT  T.x
INTO    TMP.test
FROM    (VALUES(1))T(x)

ROLLBACK

SELECT  * 
FROM    TMP.test

This results in:

(1 row affected)
Msg 208, Level 16, State 1, Line 8
Invalid object name 'TMP.test'.

Yet the target table does exist. The actual query wasn't executed in an explicit transaction though, can that explain the existence of the target table?

Are the assumptions I sketched here correct? Is this a likely scenario to have happened?

Best Answer

The actual query wasn't executed in an explicit transaction though, can that explain the existence of the target table?

Yes, exactly so.

If you do a simple select into outside of an explicit transaction, there are two transactions in autocommit mode: the first creates the table and the second fills it up.

You can prove it to yourself this way:

In a dedicated database on a test server in simple recovery model, first make a checkpoint and ensure that the log contains only a few rows (3 in case of 2016) related to checkpoint. Then run a select into of one row and check the log again, looking for a begin tran associated with select into:

checkpoint;

select *
from sys.fn_dblog(null, null);

select 'a' as col
into dbo.t3;  

select *
from sys.fn_dblog(null, null)
where Operation = 'LOP_BEGIN_XACT'
      and [Transaction Name] = 'SELECT INTO';

You'll get 2 rows, showing you had 2 transactions.

Are the assumptions I sketched here correct? Is this a likely scenario to have happened?

Yes, they are correct.

The insert part of select into was rolled back, but it does not release any data space. You can verify this by executing sp_spaceused; you'll see plenty of unallocated space.

If you want the database to release this unallocated space you should shrink your data file(s).