SQL Server SELECT INTO Statement Progress – How to Monitor?

insertsql serversql server 2014

Our ETL flow has a long-running SELECT INTO-statement, that's creating a table on the fly, and populating it with several hundred million records.

The statement looks something like SELECT ... INTO DestTable FROM SrcTable

For monitoring purposes, we would like to get a rough idea of the progress of this statement, while it is executing (approx. rowcount, written number of bytes, or similar).

We tried the following to no avail:

-- Is blocked by the SELECT INTO statement:
select count(*) from DestTable with (nolock)

-- Returns 0, 0:
select rows, rowmodctr
from sysindexes with (nolock)
where id = object_id('DestTable')

-- Returns 0:
select rows
from sys.partitions
where object_id = object_id('DestTable')

Furthermore, we can see the transaction in sys.dm_tran_active_transactions, but I was not able to find a way to get the count of affected rows on a given transaction_id (something similar to @@ROWCOUNT perhaps, but with the transaction_id as argument).

I understand that on SQL Server the SELECT INTO-statement is both a DDL and a DML statement in one, and as such, the implicit table creation will be a locking operation. I still think there must be some clever way to obtain some kind of progress information while the statement is running.

Best Answer

I suspect that rows in sys.partitions is 0 due to not being committed yet. But this does not mean that SQL Server is unaware of what will go there if the Transaction does commit. The key is in remembering that all operations go through the Buffer Pool (i.e. memory) first, regardless of COMMIT or ROLLBACK of the operation. Hence, we can look in sys.dm_os_buffer_descriptors for that info:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT  --OBJECT_NAME(sp.[object_id]) AS [TableName], sdobd.*, '---', sp.*, '---', sau.*
       SUM(sdobd.[row_count]) AS [BufferPoolRows],
       SUM(sp.[rows]) AS [AllocatedRows],
       COUNT(*) AS [DataPages]
FROM sys.dm_os_buffer_descriptors sdobd
INNER JOIN  sys.allocation_units sau
        ON sau.[allocation_unit_id] = sdobd.[allocation_unit_id]
INNER JOIN  sys.partitions sp
        ON  (   sau.[type] = 1
            AND sau.[container_id] = sp.[partition_id]) -- IN_ROW_DATA
        OR  (   sau.[type] = 2
            AND sau.[container_id] = sp.[hobt_id]) -- LOB_DATA
        OR  (   sau.[type] = 3
            AND sau.[container_id] = sp.[partition_id]) -- ROW_OVERFLOW_DATA
WHERE   sdobd.[database_id] = DB_ID()
AND     sdobd.[page_type] = N'DATA_PAGE'
AND     sp.[object_id] = (SELECT so.[object_id]
                          FROM   sys.objects so
                          WHERE  so.[name] = 'TestDump')

If you want to see the details, uncomment the first row of items in the SELECT list, comment out the remaining 3 lines.

I tested by running the following in one Session and then repeatedly running the query above in another.

SELECT so1.*
INTO   dbo.TestDump
FROM   sys.objects so1
CROSS JOIN sys.objects so2
CROSS JOIN sys.objects so3;