Sql-server – kind of deadlock that SQL Server won’t kill automatically

sql serversql server 2014

We've been having some blocking problems on SQL Server 2014.

First, let me test my knowledge:

A deadlock, as I understand it, is when process A blocks process B which then also blocks process A, creating a cycle in the graph of blocked processes. It was my understanding that, when this occurs, SQL Server can recognize it and kill one of the processes, the deadlock victim. I assume SQL Server can also recognize a cycle of three or more processes, too, but maybe not?

What's not a deadlock is when a process locks a resource, then some other processes need it and waits for it to become unlocked. If the first process never completes, then the other processes will just wait (until they timeout). SQL Server will not kill any processes in this case because blocking is normal.

Now, are there any kind of deadlocks that SQL Server either can't detect or will choose not to kill?

We have some users running heavy queries, a job that runs every 5 minutes to update the tables, and a second job that runs a small query to check the health of the first job. (It checks the most recent date and sends an email if the data is stale.)

My DBA tells me that some combination of the above three things is creating a deadlock, but that SQL Server won't kill it. Not a normal block, but an actual deadlock. Is that possible?

Best Answer

As an anecdote that may or may not shed some light on your particular issue, we've had a similar problem to this where an external application would execute a stored procedure and improperly roll back when an error was encountered.

This procedure was vendor provided and unable to be modified. Adding to this, XACT_ABORT was set to the default value of OFF and could not be modified.

The application would reach a CommandTimeout event (due to SqlException being thrown and the application not handling this appropriately), but the session would remain open.

Since XACT_ABORT is OFF and explicit transaction statements were used in the SP without appropriate error handling, the session would maintain any locks held by the aborted procedure. The session itself would be in a "SLEEPING" status with "AWAITING COMMAND".

This was a somewhat insidious issue for us, because the failed session would not be immediately obvious. However, contention would rapidly grow in the way that a vehicle stopping in the middle of a busy intersection will cause a multi-car incident. Unrelated applications attempting to access the locked resources held by the now-orphaned session would simply WAIT, entering a SUSPENDED status while potentially holding locks of their own, creating a scenario of branching contention and lock escalation.

While there are certainly other methods, this was a very busy application database which limited the usefulness of more direct methods which would have been optimal in a more controlled or quieter environment. In order to quickly and easily identify the ROOT session responsible, we cobbled together the following query, which returns the contention "tree" and allowed us to KILL the offending session (once appropriately identified).

USE master
GO;

WITH CTE_Blks
AS (
 SELECT Blocked [SPID]
 FROM sysprocesses
 WHERE Blocked <> 0
 )
 , CTE_Tree
AS (
 SELECT DB_NAME(sp.dbid) AS [Database]
  , spid AS [Session]
  , Blocked [BlockedBy]
  , sp.STATUS AS [Status]
  , cmd [Command]
  , loginame AS [Login]
  , hostname AS [Host]
  , program_name AS [Program]
  , ec.client_Net_Address [IPv4]
  , ec.Client_TCP_Port AS [Port]
  , Last_Batch
  , ST.Query [Transaction]
  , ST2.Query AS [MostRecentStmnt]
 FROM sysprocesses SP
 LEFT JOIN sys.dm_exec_connections ec ON ec.session_id = sp.spid
 CROSS APPLY (
  SELECT TEXT AS [processing-instruction(x)]
  FROM sys.dm_exec_sql_text(SP.sql_handle)
  FOR XML PATH('')
   , TYPE
  ) AS ST(Query)
 CROSS APPLY (
  SELECT TEXT AS [processing-instruction(x)]
  FROM sys.dm_exec_sql_text(ec.most_recent_sql_handle)
  FOR XML PATH('')
   , TYPE
  ) AS ST2(Query)
 WHERE blocked <> 0
  OR spid IN (
   SELECT SPID
   FROM CTE_Blks
   )
 )
SELECT CASE 
  WHEN BlockedBy = 0
   THEN 'ROOT'
  WHEN BlockedBy <> 0
   AND EXISTS (
    SELECT 1
    FROM CTE_Tree s1
    WHERE s1.BlockedBy = t1.Session
    )
   THEN 'BRANCH'
  ELSE 'LEAF'
  END [NodeType]
 , t1.*
FROM CTE_Tree t1
ORDER BY [NodeType]

To add some explanation to this:

  1. ROOT is obviously the orphaned session and should be KILLed.
  2. A LEAF is a blocked session which is not, itself, blocking other transactions.
  3. A BRANCH is a blocked session which, due to its own locks, is blocking another session--either another BRANCH in the chain, or a LEAF (terminal).

The use of returning our sql_handle text as an XML processing-instruction may seem odd, but was deliberate to circumvent truncation, preserve formatting, and bypass the need to explicitly escape characters.