Sql-server – DDL contention on TempDB

ddlperformancesql serversql-server-2005tempdb

I have a SQL Server 2005 Standard x64 that is experiencing issues with TempDB DDL contention for the past few months. The server will experiencing contention on wait resource 2:1:103 (the wait type being PAGELATCH_EX).

The issue appears to happen sporadically when the server is under decent load. I've been monitoring "Temp Tables for Destruction" rate and it can jump to 5,000+ during times when we have issues of PAGELATCH_EX on 2:1:103. From what I've read this counter should be 0 the majority of the time, but ours seems to stay anywhere from 300-1100 the majority of the time. The counter only goes to 0 when there is very few users on the system.

How can I narrow down what is causing the DDL contention on tempdb without having to look for a needle in a hay stack?

Best Answer

I've seen this very issue and the hotfix that was ultimately released to fix it was actually a direct result of my case with Microsoft CSS. There is no public KB article for the fix. Please make sure you've applied Service Pack 4 and the most recent cumulative update to SQL Server (at the time of writing, that's Cumulative Update #3 (9.00.5259)).

Until the hotfix was released, Microsoft's suggestion was to simply stop creating #temp tables (much like KB #916086). Since this would have meant a substantial re-write of dozens and dozens of reporting procedures, the workaround in my case (regardless of trace flags or temp file layout) was to restart our cluster every other weekend. Yuck.

In order to track down tempdb usage, there are several scripts around that can help, e.g. see Adam Machanic's sp_whoIsActive, specifically:

And also this script (and ones in the comments) from @SQLSoldier:

I would make sure all your cursors are using LOCAL STATIC READ_ONLY FORWARD_ONLY (see this and this), and see if there are any known expensive queries that make extensive use of #temp tables / @table variables, CTEs, or may contain unnecessary sorts or lead to hash joins... all of which can contribute to the problem (I doubt you'll find one golden cause). The easiest sweeping fix as a "bang-for-your-buck" starting point will be to use proper and inexpensive cursor options instead of the defaults.

In the meantime I would (a) install CU#3 and (b) call PSS. Tell them you are after a very specific fix that has already been confirmed as a bug and released to other users as a private hotfix: "VSTS #109112 - Temp table deferred drop doesn't scale for certain workloads." You may have to pay the case fee initially but, since it is a bug, the charge should be refunded.