SQL Server – How to Identify Query Filling Tempdb Transaction Log

sql-server, sql-server-2005, sql-server-2008, tempdb, transaction-log

I would like to know how to identify the exact query or stored procedure that is filling up the transaction log of the tempdb database.

Best Answer

From http://www.sqlservercentral.com/scripts/tempdb/72007/

;WITH task_space_usage AS (
    -- SUM alloc/dealloc pages per session and request
    SELECT session_id,
           request_id,
           SUM(internal_objects_alloc_page_count) AS alloc_pages,
           SUM(internal_objects_dealloc_page_count) AS dealloc_pages
    FROM sys.dm_db_task_space_usage WITH (NOLOCK)
    WHERE session_id <> @@SPID
    GROUP BY session_id, request_id
)
SELECT TSU.session_id,
       TSU.alloc_pages * 1.0 / 128 AS [internal object MB space],
       TSU.dealloc_pages * 1.0 / 128 AS [internal object dealloc MB space],
       EST.text,
       -- Extract statement from sql text
       ISNULL(
           NULLIF(
               SUBSTRING(
                 EST.text, 
                 -- offsets are 0-based byte offsets into Unicode text; SUBSTRING is 1-based
                 ERQ.statement_start_offset / 2 + 1, 
                 CASE WHEN ERQ.statement_end_offset < ERQ.statement_start_offset 
                  THEN 0 
                 ELSE ( ERQ.statement_end_offset - ERQ.statement_start_offset ) / 2 + 1 END
               ), ''
           ), EST.text
       ) AS [statement text],
       EQP.query_plan
FROM task_space_usage AS TSU
INNER JOIN sys.dm_exec_requests ERQ WITH (NOLOCK)
    ON  TSU.session_id = ERQ.session_id
    AND TSU.request_id = ERQ.request_id
OUTER APPLY sys.dm_exec_sql_text(ERQ.sql_handle) AS EST
OUTER APPLY sys.dm_exec_query_plan(ERQ.plan_handle) AS EQP
WHERE EST.text IS NOT NULL OR EQP.query_plan IS NOT NULL
ORDER BY 3 DESC;

EDIT

As Martin pointed out in a comment, this will not find active transactions that are occupying space in tempdb; it will only find active queries that are currently using space there (and are likely culprits for current log usage). So there could be an open transaction whose problem query is no longer running.

You could change the inner join on sys.dm_exec_requests to a left outer join; the query will then also return rows for sessions that aren't currently running queries.
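A minimal sketch of that change, trimmed to the essential columns. One assumption worth stating: the WHERE clause of the original query has to be dropped as well, because rows without a matching request have NULL text and a NULL plan and would otherwise be filtered out again. Sorting by allocated pages is also a choice made here, not part of the original script.

```sql
;WITH task_space_usage AS (
    SELECT session_id,
           request_id,
           SUM(internal_objects_alloc_page_count) AS alloc_pages,
           SUM(internal_objects_dealloc_page_count) AS dealloc_pages
    FROM sys.dm_db_task_space_usage
    WHERE session_id <> @@SPID
    GROUP BY session_id, request_id
)
SELECT TSU.session_id,
       TSU.alloc_pages * 1.0 / 128 AS [internal object MB space],
       TSU.dealloc_pages * 1.0 / 128 AS [internal object dealloc MB space],
       EST.text   -- NULL when there is no matching active request
FROM task_space_usage AS TSU
LEFT OUTER JOIN sys.dm_exec_requests AS ERQ
    ON  TSU.session_id = ERQ.session_id
    AND TSU.request_id = ERQ.request_id
OUTER APPLY sys.dm_exec_sql_text(ERQ.sql_handle) AS EST
ORDER BY TSU.alloc_pages DESC;
```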

The query Martin posted...

SELECT database_transaction_log_bytes_reserved,session_id 
  FROM sys.dm_tran_database_transactions AS tdt 
  INNER JOIN sys.dm_tran_session_transactions AS tst 
  ON tdt.transaction_id = tst.transaction_id 
  WHERE database_id = 2;

...would identify session_ids with active transactions that are occupying log space, but you wouldn't necessarily be able to determine the actual query that caused the problem, since if it's not running now it won't be captured in the above query for active requests. You may be able to reactively check the most recent query using DBCC INPUTBUFFER but it may not tell you what you want to hear. You can outer join in a similar way to capture those actively running, e.g.:

SELECT tdt.database_transaction_log_bytes_reserved,
       tst.session_id,
       t.[text],
       [statement] = COALESCE(NULLIF(
         SUBSTRING(
           t.[text],
           -- offsets are 0-based byte offsets into Unicode text; SUBSTRING is 1-based
           r.statement_start_offset / 2 + 1,
           CASE WHEN r.statement_end_offset < r.statement_start_offset
             THEN 0
             ELSE ( r.statement_end_offset - r.statement_start_offset ) / 2 + 1 END
         ), ''
       ), t.[text])
FROM sys.dm_tran_database_transactions AS tdt
INNER JOIN sys.dm_tran_session_transactions AS tst
    ON tdt.transaction_id = tst.transaction_id
LEFT OUTER JOIN sys.dm_exec_requests AS r
    ON tst.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.plan_handle) AS t
WHERE tdt.database_id = 2;
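For a session that shows up with reserved log bytes but no active request, the reactive check mentioned above could look roughly like this. The session_id 53 is a hypothetical value standing in for one taken from the results. The second query is an alternative not mentioned in the original answer: it reads the last batch on the connection via sys.dm_exec_connections.most_recent_sql_handle. Either way, what comes back is only the most recent batch, which may not be the one that generated the log usage.

```sql
-- 53 is a hypothetical session_id taken from the query results above
DBCC INPUTBUFFER(53);

-- Alternative: text of the most recent batch on that connection
SELECT c.session_id, t.[text]
FROM sys.dm_exec_connections AS c
CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) AS t
WHERE c.session_id = 53;
```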

You can also use the DMV sys.dm_db_session_space_usage to see overall space utilization by session (but again you may not get back valid results for the query; if the query is not active, what you get back may not be the actual culprit).

;WITH s AS
(
    SELECT 
        s.session_id,
        [pages] = SUM(s.user_objects_alloc_page_count 
          + s.internal_objects_alloc_page_count) 
    FROM sys.dm_db_session_space_usage AS s
    GROUP BY s.session_id
    HAVING SUM(s.user_objects_alloc_page_count 
      + s.internal_objects_alloc_page_count) > 0
)
SELECT s.session_id, s.[pages], t.[text], 
  [statement] = COALESCE(NULLIF(
    SUBSTRING(
        t.[text], 
        -- offsets are 0-based byte offsets into Unicode text; SUBSTRING is 1-based
        r.statement_start_offset / 2 + 1, 
        CASE WHEN r.statement_end_offset < r.statement_start_offset 
        THEN 0 
        ELSE ( r.statement_end_offset - r.statement_start_offset ) / 2 + 1 END
      ), ''
    ), t.[text])
FROM s
LEFT OUTER JOIN 
sys.dm_exec_requests AS r
ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.plan_handle) AS t
ORDER BY s.[pages] DESC;

With all of these queries at your disposal, you should be able to narrow down who is using up tempdb and how, especially if you catch them in the act.

Some tips for minimizing tempdb utilization:

  1. use fewer #temp tables and @table variables
  2. minimize concurrent index maintenance, and avoid the SORT_IN_TEMPDB option if it isn't needed
  3. avoid unnecessary cursors; avoid static cursors if you think this may be a bottleneck, since static cursors use work tables in tempdb
  4. try to avoid spools (e.g. large CTEs that are referenced multiple times in the query)
  5. don't use MARS
  6. thoroughly test the use of snapshot / RCSI isolation levels - don't just turn it on for all databases since you've been told it's better than NOLOCK (it is, but it isn't free)
  7. in some cases, it may sound unintuitive, but use more temp tables. e.g. breaking up a humongous query into parts may be slightly less efficient, but it can be worth it if it avoids a huge memory spill to tempdb because the single, larger query requires a memory grant that is too large
  8. avoid enabling triggers for bulk operations
  9. avoid overuse of LOB types (max types, XML, etc) as local variables
  10. keep transactions short and sweet
  11. don't set tempdb to be everyone's default database
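To check whether the "keep transactions short" tip is being violated, one option is to list the transactions that have written to tempdb, oldest first. This is a sketch that reuses the DMVs from the queries above; the open_seconds alias is made up here, and the filter on database_transaction_begin_time assumes you only care about transactions that have actually generated log records in tempdb.

```sql
SELECT tst.session_id,
       tdt.database_transaction_begin_time,
       DATEDIFF(SECOND, tdt.database_transaction_begin_time, GETDATE()) AS open_seconds,
       tdt.database_transaction_log_bytes_used,
       tdt.database_transaction_log_bytes_reserved
FROM sys.dm_tran_database_transactions AS tdt
INNER JOIN sys.dm_tran_session_transactions AS tst
    ON tdt.transaction_id = tst.transaction_id
WHERE tdt.database_id = 2                               -- tempdb
  AND tdt.database_transaction_begin_time IS NOT NULL   -- has written log records
ORDER BY tdt.database_transaction_begin_time;
```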

You may also consider that your tempdb log usage may be caused by internal processes that you have little or no control over - for example database mail, event notifications, query notifications and service broker all use tempdb in some way. You can stop using these features, but if you're using them you can't dictate how and when they use tempdb.