Sql-server – Why would SQL Server remain unresponsive after a slow query completes

sql serversql-server-2012ssms

I've just finished debugging a problematic SQL Server 2012 table-valued function. It was declared to use a unique column in the table:

CREATE FUNCTION [dbo].[myApp_getUserIdsFromXml]
    (
    @searchXml xml
    )
RETURNS @tbl_found TABLE
    (
    userId int UNIQUE
    )
AS ...

The function returns some ID values based on XML data. It was being OUTER APPLIED by various procedures without issue, until in one situation it returned about 60,000 rows, in which case the procedure execution time took 18 minutes instead of a fraction of a second.

The function was debugged and redeclared as:

RETURNS @tbl_found TABLE
    (
    userId int PRIMARY KEY /* <-- PK fixed the issue */
    )
AS ...

The execution time then dropped to 0.68 seconds.

Before debugging and altering it, even after the query had completed (18 minutes' worth), SQL Server remained extremely unresponsive. Simple queries took minutes rather than milliseconds, and the SSMS GUI continuously froze and showed Windows' not responding message. This lasted for about an hour each time before SQL seemed to assume normal speeds.

Can anyone suggest a reason why this might happen (e.g. memory not being freed, or something). I've been using SQL for over ten years and never seen behaviour like this (the behaviour was replicated on two servers).

Or, can anyone suggest TSQL code to help investigate different areas where the problem might lie? I've found code to test for locks, but there was nothing 'locked' here, the transaction had completed and nothing was being executed (by my apps) at the time.

It's obviously fixed now, but I've lost too much hair in the process…

Best Answer

Multi statement table valued functions are well known performance killers. I think the change you introduced had the beneficial side effect of turning your table variable from heap to clustered index. Other than that, I see no other difference between the two implementations.

I have no precise explanation of what could have "freezed" SQL Server, but that unresponsiveness is usually associated with high CPU. Did you take figures of what the CPU looked like during that period?

Another thing maybe worth investigating is the number of cache objects created in tempdb. Paul White has a great post on the subject on his blog. I would expect the numbers to be unchanged between heaps and CIs to be honest.

Another thing worth looking into is the preminent wait class during the "freeze". That could tell a lot about what's going on behind the covers. If you want to collect it over time you can use the Data Collector or a custom script.

BTW, if you managed to turn that function into an Inline Table Valued Function, that would probably help with performance.