SQL Server 2017 – Deadlocks from Simultaneous Bulk Inserts with Identity Column PK

bulk-insertdeadlocksql serversql-server-2017

When analyzing a deadlock graph, it looks like the victim is a bulk insert, and the blockers are the same bulk insert. Using a message bus, many subscribers can end up doing a bulk insert at roughly the same time. The deadlock usually happens, but not always.

I think these simultaneous bulk inserts are the reason why the deadlock is occurring. But a colleague mentioned it shouldn't be possible with my setup. The bulk inserts are inserting into the same table. And that table's PK is an identity column. Should it be possible for deadlocks to occur in this scenario? They are done within a transaction.

This is the line of code that does the inserts:

await bulkCopy.WriteToServerAsync(dataTableWithEnumStrings);

This is from the deadlock graph. Both the victim and blockers show the same SQL statements.

Victim:

enter image description here

Blockers:

enter image description here

SQL:

enter image description here

Here is the bulk copy code:

    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnection, SqlBulkCopyOptions.Default, sqlTransaction))
    {
        bulkCopy.BulkCopyTimeout = 0;
        bulkCopy.BatchSize = 10000;
        bulkCopy.DestinationTableName = destinationTableName;

        var dataTable = ToDataTable(histories, columnNames);
        var dataTableWithEnumStrings = ConvertDataTableEnum(dataTable);

        // Add column mappings so we don't have to worry about order when adding new columns/properties.
        foreach (DataColumn column in dataTableWithEnumStrings.Columns)
        {
            // The column mappings are case sensitive, so grab the destination column so we can use its casing.
            string destColumn = columnNames.Single(x => x.Equals(column.ColumnName, StringComparison.OrdinalIgnoreCase));
            bulkCopy.ColumnMappings.Add(column.ColumnName, destColumn);
        }

        await bulkCopy.WriteToServerAsync(dataTableWithEnumStrings);
    }

The SentryOne Plan Explorer diagram:

enter image description here

Here is the XML (XDL)

From what I can tell, it looks like a page lock. So maybe just too many inserts at once?

The number of rows can vary, but the high end might be something like 4,000.

Here is the table's schema, generated from a create script:

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[AccrualHistory](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [CompanyID] [int] NOT NULL,
    [EmployeeID] [bigint] NOT NULL,
    [AccrualID] [bigint] NOT NULL,
    [ChangeAmount] [decimal](12, 6) NOT NULL,
    [ProcessingDateTime] [datetime] NOT NULL,
    [AppliedDate] [date] NOT NULL,
    [ActionApplication] [varchar](20) NOT NULL,
    [ActionDescription] [varchar](300) NOT NULL,
    [LastChangeDate] [datetime2](7) NULL,
    [LastChangeUserID] [bigint] NOT NULL,
    [FrequencyType] [char](1) NOT NULL,
    [ServerName] [varchar](100) NOT NULL,
    [ApplicationName] [varchar](100) NOT NULL,
    [CalculationID] [uniqueidentifier] NULL,
    [CalendarID] [uniqueidentifier] NULL,
    [IncludedInBalance] [bit] NOT NULL,
 CONSTRAINT [PK_AccrualHistory_ID] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[AccrualHistory] ADD  CONSTRAINT [dft_AccrualHistory_LastChangeDate]  DEFAULT (getdate()) FOR [LastChangeDate]
GO

ALTER TABLE [dbo].[AccrualHistory] ADD  CONSTRAINT [dft_AccrualHistory_LastChangeUserID]  DEFAULT ((0)) FOR [LastChangeUserID]
GO

ALTER TABLE [dbo].[AccrualHistory] ADD  CONSTRAINT [DF_AccrualHistory_FrequencyType]  DEFAULT ('') FOR [FrequencyType]
GO

ALTER TABLE [dbo].[AccrualHistory] ADD  CONSTRAINT [DF_AccrualHistory_ServerName]  DEFAULT ('') FOR [ServerName]
GO

ALTER TABLE [dbo].[AccrualHistory] ADD  CONSTRAINT [DF_AccrualHistory_ApplicationName]  DEFAULT ('') FOR [ApplicationName]
GO

ALTER TABLE [dbo].[AccrualHistory] ADD  DEFAULT ((1)) FOR [IncludedInBalance]
GO

I just noticed the table's PK has this:

ALLOW_PAGE_LOCKS = ON

To be honest, I don't know enough to know if I should alter that.

And here is the non-clustered index that's also on the table:

CREATE NONCLUSTERED INDEX [IX_AccrualHistory_EmployeeID_AccrualID] ON [dbo].[AccrualHistory]
(
    [EmployeeID] ASC,
    [AccrualID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

SQL Server version: 14.0.3192.2

DB recovery model: FULL

Best Answer

I don't know why this is deadlocking, and I can't reproduce it. But like a lot of deadlock scenarios it's not necessary to fully understand them to remediate them.

A deadlock is caused by locking to little, and too late, and they can typically be resolved by forcing an earlier and more exclusive lock. And they best way to handle explicit locking in SQL Server is with sp_getapplock. This will force your bulk load sessions to serialize, and load one-at-a-time without relying on the normal row/page locking to do this. Since all your new rows have to go at the end of the index, the loads can't really run in parallel to begin with.

Before calling SqlBulkCopy, run this on your SqlTransaction:

static void GetAppLock(string name, SqlTransaction tran)
{
    var cmd = new SqlCommand("sp_getapplock", tran.Connection);
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Transaction = tran;
    var pResource = cmd.Parameters.Add(new SqlParameter("@Resource", SqlDbType.NVarChar, 255));
    pResource.Value = name;

    var pLockMode = cmd.Parameters.Add(new SqlParameter("@LockMode", SqlDbType.VarChar, 32));
    pLockMode.Value = "Exclusive";

    cmd.ExecuteNonQuery();
}

Or since the target table is not a heap, you should get exclusive access with SqlBulkCopyOptions.TableLock. But be aware that if you're not in READ COMMITTED SNAPSHOT mode that will block all reads to the table, and in any case it will block other non-bulk writes.