SQL Server – How to Identify Query Filling Tempdb Transaction Log

Tags: sql-server, sql-server-2005, sql-server-2008, tempdb, transaction-log

I would like to know how to identify the exact query or stored procedure that is filling up the transaction log of the tempdb database.

Best Answer

From http://www.sqlservercentral.com/scripts/tempdb/72007/

;WITH task_space_usage AS (
    -- SUM alloc/dealloc pages
    SELECT session_id,
           request_id,
           SUM(internal_objects_alloc_page_count) AS alloc_pages,
           SUM(internal_objects_dealloc_page_count) AS dealloc_pages
    FROM sys.dm_db_task_space_usage WITH (NOLOCK)
    WHERE session_id <> @@SPID
    GROUP BY session_id, request_id
)
SELECT TSU.session_id,
       TSU.alloc_pages * 1.0 / 128 AS [internal object MB space],
       TSU.dealloc_pages * 1.0 / 128 AS [internal object dealloc MB space],
       EST.text,
       -- Extract statement from sql text
       ISNULL(
           NULLIF(
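               SUBSTRING(
                 EST.text, 
                 ERQ.statement_start_offset / 2 + 1,  -- offsets are 0-based bytes; SUBSTRING is 1-based
                 CASE WHEN ERQ.statement_end_offset < ERQ.statement_start_offset 
                  THEN 0  -- end offset of -1 means "to end of batch"; falls back to the full text
                 ELSE ( ERQ.statement_end_offset - ERQ.statement_start_offset ) / 2 + 1 END
               ), ''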
           ), EST.text
       ) AS [statement text],
       EQP.query_plan
FROM task_space_usage AS TSU
INNER JOIN sys.dm_exec_requests ERQ WITH (NOLOCK)
    ON  TSU.session_id = ERQ.session_id
    AND TSU.request_id = ERQ.request_id
OUTER APPLY sys.dm_exec_sql_text(ERQ.sql_handle) AS EST
OUTER APPLY sys.dm_exec_query_plan(ERQ.plan_handle) AS EQP
WHERE EST.text IS NOT NULL OR EQP.query_plan IS NOT NULL
ORDER BY 3 DESC;

EDIT

As Martin pointed out in a comment, this would not find active transactions that are occupying space in tempdb; it will only find active queries that are currently utilizing space there (and are likely culprits for current log usage). So there could be an open transaction whose log usage is the problem, while the query that caused it is no longer running.

You could change the inner join on sys.dm_exec_requests to a left outer join; then you will also return rows for sessions that aren't currently running queries. Note that the WHERE clause would need to be relaxed as well, since both EST.text and EQP.query_plan are NULL for those rows.
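A minimal sketch of that change, assuming the same DMVs as the script above. The statement-extraction column is left out for brevity, and the filter on alloc_pages is my own substitute for the original text/plan filter, which would otherwise discard the rows the left join adds:

```sql
-- Sketch: task-space query with sys.dm_exec_requests as an optional join.
;WITH task_space_usage AS (
    SELECT session_id,
           request_id,
           SUM(internal_objects_alloc_page_count)   AS alloc_pages,
           SUM(internal_objects_dealloc_page_count) AS dealloc_pages
    FROM sys.dm_db_task_space_usage
    WHERE session_id <> @@SPID
    GROUP BY session_id, request_id
)
SELECT TSU.session_id,
       TSU.alloc_pages * 1.0 / 128   AS [internal object MB space],
       TSU.dealloc_pages * 1.0 / 128 AS [internal object dealloc MB space],
       EST.text,        -- NULL when there is no matching active request
       EQP.query_plan   -- NULL when there is no matching active request
FROM task_space_usage AS TSU
LEFT OUTER JOIN sys.dm_exec_requests AS ERQ
    ON  TSU.session_id = ERQ.session_id
    AND TSU.request_id = ERQ.request_id
OUTER APPLY sys.dm_exec_sql_text(ERQ.sql_handle) AS EST
OUTER APPLY sys.dm_exec_query_plan(ERQ.plan_handle) AS EQP
WHERE TSU.alloc_pages > 0   -- assumption: only rows that allocated something are interesting
ORDER BY TSU.alloc_pages DESC;
```

Rows with a NULL text column are the ones the inner-join version would have hidden.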

The query Martin posted...

SELECT database_transaction_log_bytes_reserved,session_id 
  FROM sys.dm_tran_database_transactions AS tdt 
  INNER JOIN sys.dm_tran_session_transactions AS tst 
  ON tdt.transaction_id = tst.transaction_id 
  WHERE database_id = 2;

...would identify session_ids with active transactions that are occupying log space, but you wouldn't necessarily be able to determine the actual query that caused the problem, since if it's not running now it won't be captured in the above query for active requests. You may be able to reactively check the most recent query using DBCC INPUTBUFFER but it may not tell you what you want to hear. You can outer join in a similar way to capture those actively running, e.g.:

SELECT tdt.database_transaction_log_bytes_reserved, tst.session_id,
       t.[text], [statement] = COALESCE(NULLIF(
         SUBSTRING(
           t.[text],
           r.statement_start_offset / 2 + 1,  -- offsets are 0-based bytes; SUBSTRING is 1-based
           CASE WHEN r.statement_end_offset < r.statement_start_offset
             THEN 0
             ELSE ( r.statement_end_offset - r.statement_start_offset ) / 2 + 1 END
         ), ''
       ), t.[text])
FROM sys.dm_tran_database_transactions AS tdt
INNER JOIN sys.dm_tran_session_transactions AS tst
    ON tdt.transaction_id = tst.transaction_id
LEFT OUTER JOIN sys.dm_exec_requests AS r
    ON tst.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.plan_handle) AS t
WHERE tdt.database_id = 2;
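
For a session that holds log space but has no active request, the reactive check mentioned earlier might look like the sketch below. The session_id 53 is a hypothetical value; substitute one returned by the transaction query. The second statement is an alternative I am adding that reads the last batch from sys.dm_exec_connections, and it carries the same caveat: the last batch sent is not necessarily the one that generated the log records.

```sql
-- 53 is a hypothetical session_id taken from the transaction query.
DBCC INPUTBUFFER(53);

-- Alternative: most recent batch text recorded for that session's connection.
SELECT c.session_id, t.[text]
FROM sys.dm_exec_connections AS c
CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) AS t
WHERE c.session_id = 53;
```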

You can also use the DMV sys.dm_db_session_space_usage to see overall space utilization by session (but again you may not get back valid results for the query; if the query is not active, what you get back may not be the actual culprit).

;WITH s AS
(
    SELECT 
        s.session_id,
        [pages] = SUM(s.user_objects_alloc_page_count 
          + s.internal_objects_alloc_page_count) 
    FROM sys.dm_db_session_space_usage AS s
    GROUP BY s.session_id
    HAVING SUM(s.user_objects_alloc_page_count 
      + s.internal_objects_alloc_page_count) > 0
)
SELECT s.session_id, s.[pages], t.[text], 
  [statement] = COALESCE(NULLIF(
    SUBSTRING(
        t.[text], 
        r.statement_start_offset / 2 + 1,  -- offsets are 0-based bytes; SUBSTRING is 1-based
        CASE WHEN r.statement_end_offset < r.statement_start_offset 
        THEN 0 
        ELSE ( r.statement_end_offset - r.statement_start_offset ) / 2 + 1 END
      ), ''
    ), t.[text])
FROM s
LEFT OUTER JOIN 
sys.dm_exec_requests AS r
ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.plan_handle) AS t
ORDER BY s.[pages] DESC;

With all of these queries at your disposal, you should be able to narrow down who is using up tempdb and how, especially if you catch them in the act.
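
Before chasing individual sessions, it may help to confirm how full the tempdb log actually is and what is preventing it from being reused. This is a small sketch using standard commands, not part of the original script:

```sql
-- Log size and percent used for every database; look at the tempdb row.
DBCC SQLPERF(LOGSPACE);

-- Reason the tempdb log cannot currently be reused (e.g. ACTIVE_TRANSACTION).
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE database_id = 2;
```

A value of ACTIVE_TRANSACTION points back at the open-transaction queries above.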

Some tips for minimizing tempdb utilization

use fewer #temp tables and @table variables
minimize concurrent index maintenance, and avoid the SORT_IN_TEMPDB option if it isn't needed
avoid unnecessary cursors; avoid static cursors if you think this may be a bottleneck, since static cursors use work tables in tempdb
try to avoid spools (e.g. large CTEs that are referenced multiple times in the query)
don't use MARS
thoroughly test the use of snapshot / RCSI isolation levels - don't just turn it on for all databases since you've been told it's better than NOLOCK (it is, but it isn't free)
it may sound counterintuitive, but in some cases use more temp tables, e.g. breaking up a humongous query into parts may be slightly less efficient, but it can avoid a huge spill to tempdb when the single, larger query requires a memory grant that is too large...
avoid enabling triggers for bulk operations
avoid overuse of LOB types (max types, XML, etc.) as local variables
keep transactions short and sweet
don't set tempdb to be everyone's default database
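
Two of these tips can be checked directly from the catalog views. The following is a rough sketch, not an exhaustive audit: the first query lists databases with row versioning enabled (whose versions are stored in tempdb), and the second lists logins whose default database is tempdb.

```sql
-- Databases with snapshot isolation or RCSI enabled.
SELECT name,
       snapshot_isolation_state_desc,
       is_read_committed_snapshot_on
FROM sys.databases
WHERE snapshot_isolation_state_desc <> 'OFF'
   OR is_read_committed_snapshot_on = 1;

-- Logins whose default database is tempdb.
SELECT name, type_desc
FROM sys.server_principals
WHERE default_database_name = N'tempdb';
```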

You may also consider that your tempdb log usage may be caused by internal processes that you have little or no control over. For example, Database Mail, event notifications, query notifications and Service Broker all use tempdb in some way. You can stop using these features, but if you're using them you can't dictate how and when they use tempdb.

Related Solutions

Find Transactions Filling Up the Version Store in SQL Server

It doesn't really make sense to track version store by session, or by transaction, or by query. If two different users are making use of the same version of a row/table, who owns it?

You can track this by object, though, which can help you narrow down which modules are causing the churn. Have a look at sys.dm_tran_top_version_generators:

USE [your database];
GO
SELECT obj = 
  QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id))
  + '.' + QUOTENAME(OBJECT_NAME(p.object_id)),
  vs.aggregated_record_length_in_bytes
FROM sys.dm_tran_top_version_generators AS vs
INNER JOIN sys.partitions AS p
ON vs.rowset_id = p.hobt_id
WHERE vs.database_id = DB_ID()
AND p.index_id IN (0,1);

And on SQL Server 2008+, you can also figure out which modules reference these tables by adding sys.dm_sql_referencing_entities:

SELECT 
  obj = QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id))
  + '.' + QUOTENAME(OBJECT_NAME(p.object_id)),
  referenced_by = QUOTENAME(r.referencing_schema_name)
  + '.' + QUOTENAME(r.referencing_entity_name),
  vs.aggregated_record_length_in_bytes AS size
FROM sys.dm_tran_top_version_generators AS vs
INNER JOIN sys.partitions AS p
ON vs.rowset_id = p.hobt_id
CROSS APPLY sys.dm_sql_referencing_entities
(
  QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id))
  + '.' + QUOTENAME(OBJECT_NAME(p.object_id)), 'OBJECT'
) AS r
WHERE vs.database_id = DB_ID()
AND p.index_id IN (0,1)
ORDER BY size DESC, referenced_by;

This assumes that none of the version store is generated by ad hoc queries. It also doesn't tell you which of those modules is actually responsible - hopefully the naming scheme is logical and helps you narrow it down a bit.

(On 2005 you might be able to go through sysdepends and other old-style dependency views but I'm not 100% sure how reliable that would be.)
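
As a rough illustration of the old-style approach, a lookup against sys.sql_dependencies could look like the sketch below. dbo.YourTable is a hypothetical name standing in for a table returned by the first query, and, as noted, this dependency data can be incomplete.

```sql
-- dbo.YourTable is a hypothetical name; substitute a table from the first query.
-- OBJECT_SCHEMA_NAME requires SQL Server 2005 SP2 or later.
SELECT DISTINCT
       referenced_by = QUOTENAME(OBJECT_SCHEMA_NAME(d.object_id))
                     + '.' + QUOTENAME(OBJECT_NAME(d.object_id))
FROM sys.sql_dependencies AS d
WHERE d.referenced_major_id = OBJECT_ID(N'dbo.YourTable');
```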

SQL Server – Difference between the tempdb log and a database log? In which circumstances are these log files used?

MS SQL Server is an ACID-compliant RDBMS. One of the critical tools involved in providing ACID compliance is the transaction log. The transaction log is where modifications are recorded, so SQL Server can undo them and restore the previous state if a transaction has to be rolled back. There is much to consider here, but you want to make sure you size the log and its virtual log files (VLFs) and manage them in advance.

TempDB is a temporary area used by MS SQL Server to store objects like temp tables, join results, some caching, etc. Unless you have performance issues, you generally don't need to worry about it, and it cannot be backed up anyway: it gets rebuilt on startup. It is good practice to create at least 2, or maybe up to 4, physical data files for TempDB so you don't get PAGELATCH contention.
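
To see how many files tempdb currently has and how they are sized, a query along these lines can be used (a sketch; the column aliases are my own):

```sql
-- One row per tempdb file.
SELECT name,
       type_desc,                 -- ROWS = data file, LOG = log file
       size_mb = size / 128.0,    -- size is stored in 8 KB pages
       growth,                    -- pages, or percent when is_percent_growth = 1
       is_percent_growth
FROM tempdb.sys.database_files;
```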

Paul Randal (an authority on MS SQL Server) recommends letting your transaction log stay at whatever size it grows to after a full week's activity, including reindexing. Really, unless it grows a lot, don't worry too much about it. Just make sure to avoid repeated autogrowth.
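
One way to review the configured log sizes and growth settings across the instance is sketched below; it reads sys.master_files, and the aliases are my own.

```sql
-- Configured log file size and growth settings for every database.
SELECT database_name = DB_NAME(database_id),
       name,
       size_mb = size / 128.0,    -- size is stored in 8 KB pages
       growth,                    -- pages, or percent when is_percent_growth = 1
       is_percent_growth
FROM sys.master_files
WHERE type_desc = 'LOG'
ORDER BY size DESC;
```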
