Sql-server – SQL Server Exclusive(X) lock not consistently blocking Shared(S) lock on a resource

lockingsql server

My understanding of locking in SQL Server has been that if one process holds an exclusive (X) lock on a resource e.g.a key row in an index, another process cannot acquire a Shared (S) lock on that same resource and must wait for it to be released.

I have been trying to demo a fix I have for a deadlock between 2 different stored procedures accessing the same table, one using an explicit transaction to wrap around a SELECT and then a DELETE, and the other which just does a SELECT with no explicit transaction. These both happen under the READ COMMITTED transaction isolation level.

I go through the steps to emulate each of the procs in 2 SSMS query windows and query sys.dm_tran_locks to view the locks being held and waiting after each step.

I would like someone to explain why, although I do sometimes get the expected deadlock, I don't always, and I can see the X lock having been granted to one connection while I can still do a select from the other connection.

The demo is contrived and the reason for splitting the SELECT on the non-clustered index, to get the clustered key value, with the SELECT that reads other columns from the clustered index using the clustered key (and wrapping both selects in an explicit transaction with HOLDLOCK on the first) is to emulate the actual query plan of my real world query, which seeks on the non-clustered index and does a Key Lookup on the clustered index. I needed to show what happens when the DELETE query on the Read-write connection clashes with the SELECT on the Readonly connection

The test database does not allow SNAPSHOT ISOLATION or READ COMMITTED SNAPSHOT and both connections have been explicitly set to READ COMMITTED Transaction isolation level so shared locks should be required to read a row.

I have tested this on my local machine with SQL2017 CU9 (on 64 bit Windows 10 Enterprise build 16299), on a server with SQL2017 CU7 (on Windows Server 2016 Standard build 14393) and also a server with SQL2016 sp1-CU2 (on Windows Server 2012 R2 Standard build 9600). All were 64 bit Developer versions of SQL Server.

I've included the code to reproduce this below: –

Create the database and table then populate it

CREATE DATABASE LockingRepro;
GO
USE [LockingRepro];
GO

CREATE TABLE dbo.Person (   PersonId            INT         NOT NULL    CONSTRAINT PK_Person PRIMARY KEY CLUSTERED,
                            LoginId             VARCHAR(50) NOT NULL    
                         )   ON [PRIMARY];  


CREATE TABLE dbo.PersonSession (    PersonId            INT                 NOT NULL CONSTRAINT PK_PersonSession PRIMARY KEY CLUSTERED,
                                    SessionId           UNIQUEIDENTIFIER    NOT NULL,
                                    LastUpdated         DATETIME            NOT NULL,
                                    SessionExpiryDate   DATETIME            NULL
                        ) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_SessionId] ON dbo.PersonSession (SessionId)  ON [PRIMARY];
GO

ALTER TABLE dbo.PersonSession 
ADD CONSTRAINT [FK_PersonSession_Person] FOREIGN KEY (PersonId) REFERENCES dbo.Person (PersonId);
GO

GO
SET NOCOUNT ON;
DECLARE @i INT = 0;

WHILE @i < 1000
BEGIN
    SET @i += 1;
    INSERT INTO dbo.Person (PersonId, LoginId)
    VALUES  (@i, 'xxxxxxxxxxxx' + CAST(@i AS VARCHAR(5)));

    INSERT INTO dbo.PersonSession   (   PersonId, SessionId, LastUpdated, SessionExpiryDate)
    VALUES      (   @i, NEWID(), GETUTCDATE(), DATEADD(MINUTE, 30, GETUTCDATE()));


END
GO

--Get the SessionId guid to paste into the other queries
SELECT SessionId 
FROM dbo.PersonSession 
WHERE PersonId = 100

Paste this code into a new query window to emulate the read write proc steps

--Read-write query window

--Pre-test steps
USE [LockingRepro];
GO
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ COMMITTED 

SELECT @@SPID -- Grab the Spid to use in the lock querying window
------------------------------

--1
BEGIN TRAN

--2
    SELECT * --
    FROM dbo.PersonSession WITH (ROWLOCK, UPDLOCK, HOLDLOCK)
    WHERE SessionId = 'Paste the SessionId for PersonId 100 here'

--5
    DELETE d
    FROM dbo.PersonSession  AS d WITH (ROWLOCK)
    WHERE PersonId = 100


-- Clean up
IF @@TRANCOUNT > 0 ROLLBACK

Paste this code into a new query window to emulate the read only proc steps

-- Read only query window

--Pre-test steps
USE [LockingRepro];
GO

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
SET NOCOUNT ON
SELECT @@SPID
-----------------------------------------


--3
BEGIN TRAN

--4
SELECT PersonId
FROM    dbo.PersonSession WITH (ROWLOCK, HOLDLOCK)
WHERE   SessionId = 'Paste the SessionId for PersonId 100 here'


--6
SELECT  PersonId, LastUpdated, SessionId, SessionExpiryDate
FROM dbo.PersonSession WITH (ROWLOCK) 
WHERE PersonId = 100


-- Clean up
IF @@TRANCOUNT > 0 ROLLBACK

Paste this code into a separate query window and edit to use the SPIDs from the ReadOnly and ReadWrtie query windows

You will need to execute this query very quickly following the query in step 6 to be able to capture the locks before a rollback occurs due to a deadlock.

USE [LockingRepro];
GO

DECLARE @ReadOnlySpid   INT = nn, -- edit to specify the spids from the other query windows
        @ReadWriteSpid  INT = nn
SELECT  CASE    WHEN    request_session_id = @ReadOnlySpid THEN 
                    'ReadOnly'
                ELSE
                    'ReadWrite'
        END    AS [Connection],
        DB_NAME(resource_database_id) AS [Database],
        resource_type,
        CASE WHEN resource_database_id = DB_ID() THEN
                CASE    WHEN resource_type = 'OBJECT' THEN
                            OBJECT_SCHEMA_NAME(resource_associated_entity_id) + '.' + OBJECT_NAME(resource_associated_entity_id) 
                        WHEN resource_type IN( 'PAGE', 'KEY', 'EXTENT', 'RID', 'HOBT') THEN
                            (   SELECT OBJECT_SCHEMA_NAME(p.[object_id]) + '.' + OBJECT_NAME(p.[object_id])    
                                      + '.' + i.[name]
                                        + '  (Ptn:' + FORMAT(p.partition_number, '0') + ')'
                                FROM sys.partitions AS p
                                LEFT JOIN   sys.indexes AS i 
                                  ON    p.[object_id] = i.[object_id]
                                  AND   p.index_id = i.index_id
                                WHERE   [partition_id] = resource_associated_entity_id
                            )
                        WHEN resource_type = 'ALLOCATION_UNIT' THEN 
                            (   SELECT OBJECT_SCHEMA_NAME(p.[object_id]) + '.' + OBJECT_NAME(p.[object_id])    
                                        + '.' + i.[name]
                                        + '  (Ptn:' + FORMAT(p.partition_number, '0') + ' Alloc:' + au.type_desc COLLATE SQL_Latin1_General_CP1_CI_AS + ')'
                                FROM sys.partitions AS p
                                INNER JOIN  sys.allocation_units AS au
                                  ON    p.[partition_id] = au.container_id
                                LEFT JOIN   sys.indexes AS i 
                                  ON    p.[object_id] = i.[object_id]
                                  AND   p.index_id = i.index_id
                                WHERE   au.allocation_unit_id = resource_associated_entity_id
                            )
                    END
            ELSE
                CAST(resource_associated_entity_id AS VARCHAR(100))
        END AS [Entity],
        resource_associated_entity_id,
        resource_description,
        resource_lock_partition,
        request_mode,
        request_type,
        request_status,
        request_reference_count        
FROM    sys.dm_tran_locks
WHERE   request_session_id IN (@ReadOnlySpid, @ReadWriteSpid)
AND     resource_type <> 'DATABASE'
ORDER BY request_session_id

Example of the lock query output for the expected behaviour i.e. a deadlock occurs

Step 1 – the ReadWrite connection starts a transaction so no locks at this point

Step 2 – the ReadWrite connection executes the SELECT with using UPDLOCK and acquires the KEY level U locks on both the non-clustered and the clustered index

Step 3 – the ReadOnly connection starts a transaction – no change to the locks

Step 4 – the Readonly connection executes the SELECT with a HOLDLOCK – it only needs to access the non-clustered index and it acquires the S lock on the KEY in that index

Step 5 – the ReadWrite connection executes the DELETE using the clustered key
– it acquires an X lock on the clustered index key but is blocked waiting to convert its U lock on the non-clustered index key to an X lock because the ReadOnly connection holds an S lock on it

Step 6 – The ReadOnly connection executes the SELECT using the clustered index key and is blocked waiting to acquire an S lock on the clustered index key, which the ReadWrite holds an X lock on, and within a couple of seconds the deadlock is detected

Example of the lock query output for the unexpected behaviour
This is the case where the ReadOnly connection SELECT in step 6 returns its result and no deadlock or blocking occurs, even though the clustered index key has an X lock granted on it to the ReadWrite connection.

All the output for steps 1 to 5 were identical to the above expected behaviour example so I won't repeat that.

This is the output from Step 6 – it doesn't show an S lock on the clustered index key for the ReadOnly connection

The query returns the results quickly so I've not been able to capture whether it actually gets a shared lock or not but the query can be executed repeatedly while the X lock on the clustered index is clearly still held by the ReadWrite connection.

Best Answer

I set up the following Extended Events session to capture the locks taken during step 6, filtered to that spid (55 for me):

CREATE EVENT SESSION [locks] ON SERVER 
ADD EVENT sqlserver.lock_acquired(
    ACTION(sqlserver.session_id,sqlserver.sql_text)
    WHERE ([sqlserver].[session_id]=(55)))
ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE=OFF)
GO

The lock taken was an Intent-Shared (IS) lock on the page (despite the ROWLOCK hint).

screenshot of lock_acquired event

This matches the X-locked object_id in the results of the sys.dm_tran_locks query you provided:

screenshot of all lock info

See Paul White's post "The Case of the Missing Shared Locks" for details, but this is a specific optimization that allows bypassing the blocking that you expect:

SQL Server contains an optimization that allows it to avoid taking row-level shared (S) locks in the right circumstances. Specifically, it can skip shared locks if there is no risk of reading uncommitted data without them.

The query in step 6 takes an IS lock on the page, skipping the normal first step of taking an S lock on the key (row).

It's possible that the intermittent nature of this behavior in your demo code is due to the fact the the ROWLOCK hint is not always used:

ROWLOCK is genuinely a hint, rather than a directive — the storage engine may or may not respect it.

Or there are other uncommitted changes on the data page (this isn't the case in your repro, but could be the case in your real scenario):

SQL Server cannot apply the locking optimization if there is an uncommitted change on the same page.

Incidentally, I haven't experienced the deadlock behavior at all on my laptop (SQL Server 2017 CU13).