SQL Server – How to Avoid IX Deadlock in Check Constraint

deadlocksql server

I have following situation, I have a table with a primary key, and a constraint that I cannot have rows with specific requirements. For demonstration purposes, here I have constraint that doesn't allow to insert duplicate values in N column. In real case, it checks several columns using foreign keys to other tables and with additional filters, so I cannot put simple unique constraint.
So here is example

create table dbo.T1 (
    Id int not null identity (1,1),
    N int not null
)

alter table dbo.T1
add primary key (Id);

go

create function [dbo].[fn_CheckN](@id int, @n int)
returns int
as 
begin
    if exists (select * from dbo.T1 t where t.n = @n and t.Id != @id)
        return 0

    return 1
end

go

alter table [dbo].T1 with nocheck add  constraint [CK_T1_Valid] check  (([dbo].[fn_CheckN]([Id],[N]) = 1))
go

alter table [dbo].T1 check constraint [CK_T1_Valid]
go

When I run concurrently

insert into dbo.T1 (N) 
values (@i)

I get this deadlock S -> X, X -> S on primary key. And that I kind of understand why. Deadlock xml:
https://pastebin.com/hceR3sum

My first attempt to fix this was to grab S lock first

begin tran 
declare @lock int = (select top(1) 1 from dbo.T1 with (tablock, holdlock))

insert into dbo.T1 (N) 
values (@i)

commit

But it failed with this deadlock S -> IX, IX -> S. Can somebody explain what is going on? Deadlock xml: https://pastebin.com/mLXJb59C.

And I fixed it with locking entire table with X lock. Is it ok? Is there a better approach?

begin tran 
declare @lock int = (select top(1) 1 from dbo.T1 with (tablockx, holdlock))

insert into dbo.T1 (N) 
values (@i)

commit

If I put index on N column I get this deadlock https://pastebin.com/KJGmLDhH. The real requirement is pretty much the same, the simplest case there are 4 columns with accountIds and enabled flag, and when something happens to the record I have to check that account id is unique among all enabled records. or if it's disabled that I have nothing to check. Something like that.

I use C# to run queries simultaneously

class Program
{
    private const string connectionString = "Server=.;Database=Performance;Trusted_Connection=True;MultipleActiveResultSets=True; Max Pool Size=3000";

    static async Task Main(string[] args)
    {
        await ClearAsync();
        await Task.WhenAll(Enumerable.Range(0, 100000).Select(async i => await InsertAsync(i)).ToArray()); 
        Console.WriteLine("Done");
        Console.ReadKey();
    }

    public static async Task InsertAsync(int i)
    {
        using var connection = new SqlConnection(connectionString);
        using var cmd = new SqlCommand(@"
insert into dbo.T1 (N) 
values (@i)
", connection);
        await connection.OpenAsync();
        cmd.Parameters.Add("@i", SqlDbType.Int).Value = i;
        await cmd.ExecuteNonQueryAsync();
    }

    public static async Task ClearAsync()
    {
        using var connection = new SqlConnection(connectionString);
        using var cmd = new SqlCommand("delete from dbo.T1", connection);
        await connection.OpenAsync();
        await cmd.ExecuteNonQueryAsync();
    }
}

I use Microsoft SQL Server Express (64-bit) 13.0.1601.5

Best Answer

If you don't provide an index on column N, SQL Server cannot check if the target value exists efficiently, and must scan the table. The scan continues until a match is found, or scans the entire table if no match exists (your target case). This is terribly inefficient, and happens per row.

At the default locking read committed isolation level, the scan also usually means acquiring and releasing shared locks on every row while they are tested for a match.

Depending on the order in which rows are inserted (and so exclusively locked) on different connections, the competing activities will lead to extensive shared-exclusive blocking or deadlocking.

So you need a nonclustered index on N, such as:

CREATE NONCLUSTERED INDEX [IX dbo.T1 N]
ON dbo.T1 (N);

With an index

You may still encounter deadlocks when the index is present (as you have seen). This is because when the table is small, SQL Server may choose to scan the nonclustered index rather than seeking to the desired value of N and then seeking for id > @id OR id < @id.

This can lead to blocking or deadlocks in a similar way to the original case. The exact interactions are a little more complex because SQL Server will normally choose to insert the new row in the clustered index and nonclustered index before checking the constraint using the scalar function. (It is possible to avoid this in the demo scenario by forcing SQL Server to insert into the nonclustered index after checking the constraint, but I don't want to get into that.)

A workaround

Let me just say that all but the most expert practitioners should avoid enforcing check constraints with scalar functions. There are just too many quirks and hidden gotchas. You will likely end up with invalid data when constraints don't work as expected, or indeed even fire at all.

Bear in mind that most people think they are a great deal more expert than is truly the case. That said, and for educational value only, it is possible to avoid the deadlock scenario in the demo by forcing seek access on the nonclustered index.

I have also added a READCOMMITTEDLOCK hint because validating constraint must be done using the most recent committed values, not row versions. The demo fails to ensure uniqueness when read committed snapshot isolation or snapshot isolation is in use, because the function may read out-of-date data.

CREATE FUNCTION dbo.fn_CheckN
(   
    @id integer, 
    @n integer
)
RETURNS integer
WITH SCHEMABINDING
AS
BEGIN
    RETURN
        CASE WHEN EXISTS 
            (
                SELECT 1
                FROM dbo.T1 AS T 
                    WITH (READCOMMITTEDLOCK, FORCESEEK)
                WHERE T.N = @n
                AND T.Id != @id
            )
            THEN 0
            ELSE 1
        END;
END;

This is still not robust code, and not fit for production use. It does address the issue shown in the question.

Depending on the real requirements, you might be able to use triggers, denormalized constraints, or an indexed view instead.

Other remarks

  • HOLDLOCK does not mean hold locks. SQL Server always holds locks long enough to guarantee correctness given the query specification and configuration settings (including isolation level). HOLDLOCK is a synonym for SERIALIZABLE. It is an advanced option for expert users to specify serializable isolation semantics for a particular object, while using a different level on other objects in the same transaction.
  • TABLOCK ensures only object-level locks are taken. TABLOCKX ensures only exclusive object-level locks are taken. Both will tend to completely serialize access to the specified object, which is great for avoiding contention — but only because you now have no concurrency at all.
  • The clustered index identity pattern will create a very hot page at the end of the base table. Concurrent inserts will all be targeting that one page, resulting in latch contention, and a reduction in throughput.
  • Your demo code specifies MARS but you are not using it. Instead, you create a new connection for each single insert. I realize it is just a quick test rig, but still.

Examples of scalar function check constraints not behaving as expected (some links are quite old, but the described behaviour is still current):

Note also that scalar functions used in check constraints are not eligible for inlining on SQL Server 2019+.