Sql-server – Troubleshooting Deadlock, Victim holding on to shared lock

deadlocksql serversql server 2014

I am trying to troubleshoot the below deadlock situation
enter image description here

Looking at the deadlock xml, both session are using read committed isolation level.
Question is, in read committed isolation level, in this situation, why is the victim holding on to the shared lock? My understanding and I just tested it, is that while reading tables, it would acquire shared lock on page/row level and then release it immediately. In what situation in read committed isolation level, would a session want to hold on to shared lock on page that it had acquired?

All the locks are on a nonclustered index on a table.

<deadlock>
    <victim-list>
        <victimProcess id="process11634c19088" />
    </victim-list>
    <process-list>
        <process id="process11634c19088" taskpriority="0" logused="0" waitresource="PAGE: 11:13:52372905 " ownerId="57900000774" transactionname="SELECT" lasttranstarted="2018-10-12T09:48:05.360" XDES="0x3e7feedc00" lockMode="S" schedulerid="12" kpid="8544" status="suspended" spid="72" sbid="0" ecid="0" priority="0" trancount="0" lastbatchstarted="2018-10-12T09:48:05.360" lastbatchcompleted="2018-10-12T09:48:05.377" lastattention="1900-01-01T00:00:00.377" clientapp=".Net SqlClient Data Provider" hostname="hostname1" hostpid="6912" loginname="ENETTAD\SVC_APP" isolationlevel="read committed (2)" xactid="57900000774" currentdb="11" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056" databaseName="database1">
            <executionStack>
                <frame procname="database1.dbo.uspBatch_Search" line="88" stmtstart="5068" stmtend="7128" sqlhandle="0x03000b00fad68c201937da0036a9000001000000000000000000000000000000000000000000000000000000">
SELECT @TotalRowCount = COUNT(*)    
  FROM dbo.Batch     
  JOIN #child ce    
   ON ce.ECN = requestednce    
  WHERE BatchId = ISNULL(@BatchId, BatchId)    
   AND RequestedUsername = ISNULL(@RequestedUsername, RequestedUsername)    
   AND requestednce = ISNULL(@requestednce, requestednce)    
   AND BatchStatusId = ISNULL(@StatusId, BatchStatusId)    
   AND BatchStatusId != @NewStatusId   and  BatchStatusId != @CancelledStatusId    
   AND (  (@IsReportRequired = @true AND ReportBlobId IS NOT NULL)     
   OR (@IsReportRequired = @false))    
   AND ( (     
   (@DateType = @CreatedDate AND CreatedDate BETWEEN cast(@FromDate AS varchar(20)) AND cast(@ToDate AS varchar(20)) )    
    OR     
   (@IncludeCurrent = @true AND ( BatchStatusId IN (@QueuedStatus,@InProgressStatus,@ReportingStatus)     
      OR     
       CompletedDate BETWEEN cast(@FromDate AS varchar(20)) AND cast(@ToDate AS varchar(20)))  )    
  )     
  OR @DateType = @IgnoreDateRange    </frame>
                <frame procname="adhoc" line="1" stmtstart="386" sqlhandle="0x01000b005101d12e207be87f2100000000000000000000000000000000000000000000000000000000000000">
EXEC @RETURN_VALUE = [dbo].[uspBatch_Search] @ECN = @p0, @BatchId = @p1, @requestednce = @p2, @RequestedUsername = @p3, @DateType = @p4, @FromDate = @p5, @ToDate = @p6, @StatusId = @p7, @IsReportRequired = @p8, @IncludeCurrent = @p9, @StartRowIndex = @p10, @SortExpression = @p11, @MaximumRows = @p12, @TotalRowCount = @p13 OUTPUT    </frame>
                <frame procname="unknown" line="1" sqlhandle="0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000">
unknown    </frame>
            </executionStack>
            <inputbuf>
(@p0 int,@p1 int,@p2 int,@p3 varchar(8000),@p4 varchar(8000),@p5 datetime,@p6 datetime,@p7 int,@p8 tinyint,@p9 bit,@p10 int,@p11 varchar(8000),@p12 int,@p13 int output,@RETURN_VALUE int output)EXEC @RETURN_VALUE = [dbo].[uspBatch_Search] @ECN = @p0, @BatchId = @p1, @requestednce = @p2, @RequestedUsername = @p3, @DateType = @p4, @FromDate = @p5, @ToDate = @p6, @StatusId = @p7, @IsReportRequired = @p8, @IncludeCurrent = @p9, @StartRowIndex = @p10, @SortExpression = @p11, @MaximumRows = @p12, @TotalRowCount = @p13 OUTPUT   </inputbuf>
        </process>
        <process id="process2e950984e8" taskpriority="0" logused="10880" waitresource="PAGE: 11:13:14044552 " waittime="241" ownerId="57900000848" transactionname="user_transaction" lasttranstarted="2018-10-12T09:48:05.367" XDES="0xf7c3c0c3b0" lockMode="IX" schedulerid="2" kpid="5476" status="suspended" spid="234" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2018-10-12T09:48:05.370" lastbatchcompleted="2018-10-12T09:48:05.367" lastattention="1900-01-01T00:00:00.367" clientapp=".Net SqlClient Data Provider" hostname="hostname1" hostpid="6912" loginname="ENETTAD\SVC_APP" isolationlevel="read committed (2)" xactid="57900000848" currentdb="11" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056" databaseName="database1">
            <executionStack>
                <frame procname="database1.dbo.uspBatch_Update" line="32" stmtstart="1934" stmtend="2586" sqlhandle="0x03000b00b243467b8df206000ba7000001000000000000000000000000000000000000000000000000000000">
UPDATE Batch      
            SET  BatchStatusId = @BatchStatusId,      
            ReportBlobId = @ReportBlobId,      
            CompletedDate = @CompletedDate,    
            ImportReportVersion = @ImportReportVersion,    
            ExportReportVersion = @ExportReportVersion,  
            ErrorCode = @ErrorCode  
            WHERE BatchId  = @BatchI    </frame>
                <frame procname="adhoc" line="1" stmtstart="236" sqlhandle="0x01000b009c24d728c05a7d06ab00000000000000000000000000000000000000000000000000000000000000">
EXEC @RETURN_VALUE = [dbo].[uspBatch_Update] @RequestId = @p0, @BatchId = @p1, @BatchStatusId = @p2, @ReportBlobId = @p3, @CompletedDate = @p4, @ErrorCode = @p5, @ImportReportVersion = @p6, @ExportReportVersion = @p7    </frame>
                <frame procname="unknown" line="1" sqlhandle="0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000">
unknown    </frame>
            </executionStack>
            <inputbuf>
(@p0 uniqueidentifier,@p1 int,@p2 int,@p3 varchar(8000),@p4 datetime,@p5 int,@p6 int,@p7 int,@RETURN_VALUE int output)EXEC @RETURN_VALUE = [dbo].[uspBatch_Update] @RequestId = @p0, @BatchId = @p1, @BatchStatusId = @p2, @ReportBlobId = @p3, @CompletedDate = @p4, @ErrorCode = @p5, @ImportReportVersion = @p6, @ExportReportVersion = @p7   </inputbuf>
        </process>
    </process-list>
    <resource-list>
        <pagelock fileid="13" pageid="52372905" dbid="11" subresource="FULL" objectname="database1.dbo.Batch" id="lock9f21680580" mode="IX" associatedObjectId="72060857777979392" databaseName="database1">
            <owner-list>
                <owner id="process2e950984e8" mode="IX" />
            </owner-list>
            <waiter-list>
                <waiter id="process11634c19088" mode="S" requestType="wait" />
            </waiter-list>
        </pagelock>
        <pagelock fileid="13" pageid="14044552" dbid="11" subresource="FULL" objectname="database1.dbo.Batch" id="lock1506883db00" mode="S" associatedObjectId="72060857777979392" databaseName="database1">
            <owner-list>
                <owner id="process11634c19088" mode="S" />
            </owner-list>
            <waiter-list>
                <waiter id="process2e950984e8" mode="IX" requestType="wait" />
            </waiter-list>
        </pagelock>
    </resource-list>
</deadlock>

Best Answer

Regarding your statement here:

My understanding and I just tested it, is that while reading tables, it would acquire shared lock on page/row level and then release it immediately

Here's how the documentation briefly describes READ COMMITTED:

Specifies that statements cannot read data that has been modified but not committed by other transactions. This prevents dirty reads. Data can be changed by other transactions between individual statements within the current transaction, resulting in nonrepeatable reads or phantom data. This option is the SQL Server default.

So your understanding is close to correct. It doesn't release the locks immediately - it releases locks as soon as possible. In this case, as soon as possible is the moment the lock on the next row or page is acquired. See Paul White's blog post The Read Committed Isolation Level:

When locking read committed acquires locks at row granularity, the shared lock taken on a row is released when a shared lock is taken on the next row. At page granularity, the shared page lock is released when the first row on the next page is read, and so on

Since the locks in the deadlock graph are page locks, I'd bet that the deadlock is happening because you have lots of small rows on each 8K page.

So essentially, the problem here is that you have one session that's SELECTing rows out of the dbo.Batch table (dbo.uspBatch_Search), and another session that's updating rows (dbo.uspBatch_Update). So this is what happens:

  1. spid 72 starts searching through dbo.Batch, acquiring (and releasing) shared (S) locks on a page at a time
  2. spid 234 starts updating rows in dbo.Batch on BatchId. My guess is this is doing updates one rows at a time, within one long transaction, holding locks until the end (it's used 10,880 bytes so far when the deadlock occurred)
  3. spid 72 runs into a page that has been locked by the update process in spid 234. So it has to stop where it is and wait, as it can't read data that's been modified by other sessions (and hasn't yet been committed)
  4. spid 234 keeps chugging along, until it hits the last row that's been S locked by the search process on spid 72 (which it's still holding, because it can't acquire the next S lock it needs). The IX lock that spid 234 is incompatible with the S lock, so it has to wait.

Thus the deadlock.

If you post execution plans and query text for these two procedures, we can probably help you with options to avoid the deadlock (with indexes or query rewrites).