Sql-server – sql server deadlock error – Shared (S) and Intent-Exclusive (IX) locks under Serializable isolation

deadlocksql server

Deadlock errors such as below happen frequently on our 3rd party vendor database:

<?xml version="1.0" encoding="utf-8" ?>
- <deadlock-list>
  - <deadlock victim="process743288">
    - <process-list>
      - <process id="process743288" taskpriority="0" logused="575444" waitresource="OBJECT: 13:1525580473:0" waittime="4022" ownerId="354957547" transactionname="user_transaction" lasttranstarted="2015-09-01T11:30:50.393" XDES="0x244baee90" lockMode="IX" schedulerid="3" kpid="2296" status="suspended" spid="98" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2015-09-01T11:30:50.393" lastbatchcompleted="2015-09-01T11:30:50.393" clientapp="Enterprise Vault" hostname="EV01" hostpid="6520" loginname="ABC\evs" isolationlevel="serializable (4)" xactid="354957547" currentdb="13" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">
        - <executionStack>
          <frame procname="EVVSJournalStore.dbo.uspci_AggregateReports" line="112" stmtstart="10420" stmtend="10486" sqlhandle="0x03000d00ef466b2aae1aac0000a500000100000000000000">DELETE FROM ArchiveSummary</frame>
        </executionStack>
        <inputbuf>Proc [Database Id = 13 Object Id = 711673583]</inputbuf>
      </process>
      - <process id="process3cfdc8" taskpriority="0" logused="601032" waitresource="OBJECT: 13:1525580473:0" waittime="4022" ownerId="354957492" transactionname="user_transaction" lasttranstarted="2015-09-01T11:30:50.387" XDES="0x182946e90" lockMode="IX" schedulerid="1" kpid="2684" status="suspended" spid="99" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2015-09-01T11:30:50.387" lastbatchcompleted="2015-09-01T11:30:50.383" clientapp="Enterprise Vault" hostname="EV01" hostpid="6280" loginname="ABC\evs" isolationlevel="serializable (4)" xactid="354957492" currentdb="13" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">
        - <executionStack>
          <frame procname="EVVSJournalStore.dbo.uspci_AggregateReports" line="112" stmtstart="10420" stmtend="10486" sqlhandle="0x03000d00ef466b2aae1aac0000a500000100000000000000">DELETE FROM ArchiveSummary</frame>
        </executionStack>
        <inputbuf>Proc [Database Id = 13 Object Id = 711673583]</inputbuf>
      </process>
    </process-list>
    - <resource-list>
      - <objectlock lockPartition="0" objid="1525580473" subresource="FULL" dbid="13" objectname="EVVSJournalStore.dbo.ArchiveSummary" id="locke716c680" mode="S" associatedObjectId="1525580473">
        - <owner-list>
          <owner id="process3cfdc8" mode="S" />
        </owner-list>
        - <waiter-list>
          <waiter id="process743288" mode="IX" requestType="convert" />
        </waiter-list>
      </objectlock>
      - <objectlock lockPartition="0" objid="1525580473" subresource="FULL" dbid="13" objectname="EVVSJournalStore.dbo.ArchiveSummary" id="locke716c680" mode="S" associatedObjectId="1525580473">
        - <owner-list>
          <owner id="process743288" mode="S" />
        </owner-list>
        - <waiter-list>
          <waiter id="process3cfdc8" mode="IX" requestType="convert" />
        </waiter-list>
      </objectlock>
    </resource-list>
  </deadlock>
</deadlock-list>

This is the relevant code within uspci_AggregateReports that causes the issue:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
--All records will be moved out of the ArchiveSummary table in a 
--serializable transaction to prevent updates to the table during consolidation. 
--When the data is moved out of ArchiveSummary into the cache table, ArchiveSummary 
--is truncated and the transaction committed to enable normal activities on the table 

INSERT ArchiveSummary_Cache
       ([VaultIdentity]
       ,[ArchivePointIdentity]
       ,[SavesetIdentity]
       ,[ArchivedDate]
       ,[OriginalSize]
       ,[ArchivedItemSize]
       ,[FileTypeID]
       ,[RefCount]
       ,[IsFSA]
       ,[MissingOriginalSizeRefCount]
       ,[IncludeInTimeSummaries]
       ,[ContentProviderIdentity]
       ,[RoundedOriginalSize])
SELECT [VaultIdentity]
      ,[ArchivePointIdentity]
      ,[SavesetIdentity]
      ,[ArchivedDate]
      ,[OriginalSize]
      ,[ArchivedItemSize]
      ,[FileTypeID]
      ,[RefCount]
      ,[IsFSA]
      ,[MissingOriginalSizeRefCount]
      ,[IncludeInTimeSummaries]
      ,[ContentProviderIdentity]
      ,[RoundedOriginalSize]
FROM ArchiveSummary

SELECT @retVal = @@ERROR,@totalSourceRecords = @@ROWCOUNT

IF( @retVal = 0)
BEGIN
    DELETE FROM  ArchiveSummary         
    SELECT @retVal = @@ERROR
    IF( @retVal = 0)
    BEGIN
        COMMIT TRANSACTION
        SET @CurrentCount = 0
    END
    ELSE
        ROLLBACK TRANSACTION
END
ELSE
BEGIN
    ROLLBACK TRANSACTION
    RETURN @retVal
END

Full SP here

In what order are the locks being taken ?
Is there any way to avoid these deadlocks ? Adding CPU, Memory, Indexes, or Locking granularity ?

Best Answer

I agree with @Kin - why is Enterprise Vault running the same query at the same time from two different sessions? Is it possible you have more copies of EV running than you thought?

Here is how SQL Sentry Plan Explorer PRO visualizes the deadlock (click to enlarge):

click to enlarge

Yes, they are both stuck trying to delete from ArchiveSummary. It's possible this is because there are other locks taken in an outer transaction that we can't see (or because EV is issuing it's own client-side transaction scope).

But before you even start to troubleshoot the code itself, I'd be interested in figuring out why this stored procedure is being called multiple times concurrently in the first place. That seems problematic no matter how you eliminate the deadlocks.