Sql-server – TempDB Log File Growth – questions

monitoringperformancesql-server-2017storagetempdb

Problem:

On one of the production boxes, TempDB transaction LOG file grew unexpectedly, filling the drive
It was 32 GB but now it is 240 GB

Questions:

  1. What is the mechanism of TempDB log file usage ?

TempDB is in SIMPLE recovery model, what can delay TempDB log file truncation
(and cause its growth as a result) ?

  1. How can I know or be alerted when TempDB data files or log file are growing ?

Is there an Extended Event for this, or other means to know, and can alert be sent by email when growth happens ?

  1. Performance Monitor metrics

Log file may be 240 GB, but it is really empty at the moment
Is there a metric that can be used to monitor amount of used space INSIDE the TempDB data/log files at any given point in time ?

I can see there are Active Temp Tables, Temp Tables Creation Rate, Worktables Created/sec, etc. but these do not show exactly what I would like to see

  1. Drive space monitoring

Is there native tool in SQL Server or Windows, that can alert me, when the free space on the TempDB drive reaches below 20%, 10% ?

  1. How to find out historically which session or task filled the TempDB data or log file ?

I can see which user/system sessions CURRENTLY consume at TempDB by querying this: sys.dm_db_session_space_usage. But how can I know, which user/task consumed large portion of TempDB, say, 2 hours ago, without having to sit and query that view ?

  1. sys.dm_db_session_space_usage

user_objects_alloc_page_count – (minus) user_objects_dealloc_page_count = the difference will be what session is CURRENTLY consuming (in pages) in TempDB ? Or it does not work that way ?

If you know at least few of those questions, your help is much appreciated

Best Answer

Why might the tempdb log grow

The usual suspect for delaying tempdb log file truncation is a long-running transaction. You can run a query like this, while the problem is occurring, to see why the log can't be truncated:

SELECT log_reuse_wait, log_reuse_wait_desc 
FROM sys.databases d 
WHERE database_id = 2;

The reasons are described on this MS docs page.

Monitoring and alerting

Questions 2 - 5 can all be solved by buying a dedicated monitoring tool (like Sentry One or SolarWinds, etc).

You can certainly accomplish all of these things with "home grown" methods, but they will inevitably have gaps and failures that the monitoring tool vendors have had years to think of and address.

For instance, you can use Extended Events (XE) to monitor for file growth events - Aaron Bertrand has a post about doing that with the sqlserver.database_file_size_change event here: Removing the default trace – Part 2.

To alert on file growths, you could have a SQL Agent job that queries this event data (either directly, or through Aaron's views) periodically, like every 1 / 5 / 15 minutes, and then emails you if there have been any in that time frame.

Getting the space used inside data and log files, as well as disk space used / available, is discussed in a different Q&A on this site, so I won't duplicate it here: Query to report disk space allocation and used space

Again, the low-tech way of alerting on this would likely be through an Agent job.

SQL Server doesn't keep a log of who or what was using tempdb at a given time - you'd have to write DMV query results to a table, check them periodically, etc.

Current tempdb usage

This is included in the results of sp_WhoIsActive. You might consider using that directly, or at least basing the calculation on the relevant part of that proc.

You'll need to add together user_objects_alloc_page_count and internal_objects_alloc_page_count (which includes LOB variables and spills), then subtract to the dealloc version of both for the "current" state of things.

You'll note that sp_WhoIsActive also includes the data from dm_db_task_space_usage for this, as some allocations will show up there and not in the other view. You can see a detailed example on my blog, where a LOB parameter being streamed from a .NET application will show up in this view: Invisible tempdb Killer