SQL Server – Cannot insert duplicate key row on a non-unique index

Tags: sql-server, sql-server-2017, update

We have encountered this strange error three times over the past few days, after being error-free for 8 weeks, and I'm stumped.

This is the error message:

Executing the query "EXEC dbo.MergeTransactions" failed with the following error:
"Cannot insert duplicate key row in object 'sales.Transactions' with unique index
'NCI_Transactions_ClientID_TransactionDate'.
The duplicate key value is (1001, 2018-12-14 19:16:29.00, 304050920).".

The index named in the error is not unique. Notice that the duplicate key value in the error message doesn't even line up with the index: it has three parts, while the index has only two key columns. The strange thing is that if I rerun the proc, it succeeds.
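
To confirm the index really is declared non-unique, the metadata can be checked like this (a diagnostic query of mine, not part of the original post):

SELECT i.name AS index_name,
       i.is_unique,
       c.name AS column_name,
       ic.key_ordinal
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'sales.Transactions')
  AND i.name = N'NCI_Transactions_ClientID_TransactionDate'
ORDER BY ic.key_ordinal;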

This is the most recent link I could find that describes my issue, but I don't see a solution there:

https://www.sqlservercentral.com/forums/topic/error-cannot-insert-duplicate-key-row-in-a-non-unique-index

A couple things about my scenario:

  • The proc updates the TransactionID (part of the primary key) – I think this is what is causing the error, but I don't know why. We'll be removing that logic.
  • Change tracking is enabled on the table (its status can be confirmed with the query below)
  • The transaction runs at READ UNCOMMITTED isolation
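
Since change tracking is one of the moving parts, its status can be confirmed from the catalog views (again a diagnostic sketch, not part of the original post):

SELECT OBJECT_NAME(ctt.object_id) AS table_name,
       ctt.is_track_columns_updated_on
FROM sys.change_tracking_tables AS ctt
WHERE ctt.object_id = OBJECT_ID(N'sales.Transactions');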

There are 45 columns in each table; I mainly listed the ones used in indexes. I'm updating the TransactionID (the clustered key) in the update statement, unnecessarily. Strangely, we had no issues for months until last week, and it only happens sporadically, via SSIS.

Table

USE [DB]
GO

/****** Object:  Table [sales].[Transactions]    Script Date: 5/29/2019 1:37:49 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND type in (N'U'))
BEGIN
CREATE TABLE [sales].[Transactions]
(
    [TransactionID] [bigint] NOT NULL,
    [ClientID] [int] NOT NULL,
    [TransactionDate] [datetime2](2) NOT NULL,
    /* snip*/
    [BusinessUserID] [varchar](150) NOT NULL,
    [BusinessTransactionID] [varchar](150) NOT NULL,
    [InsertDate] [datetime2](2) NOT NULL,
    [UpdateDate] [datetime2](2) NOT NULL,
 CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED 
(
    [TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [DB_Data]
) ON [DB_Data]
END
GO
USE [DB]

IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND name = N'NCI_Transactions_ClientID_TransactionDate')
begin
CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
    [ClientID] ASC,
    [TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE) ON [DB_Data]
END

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_Units]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD  CONSTRAINT [DF_Transactions_Units]  DEFAULT ((0)) FOR [Units]
END
GO

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_ISOCurrencyCode]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD  CONSTRAINT [DF_Transactions_ISOCurrencyCode]  DEFAULT ('USD') FOR [ISOCurrencyCode]
END
GO

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_InsertDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD  CONSTRAINT [DF_Transactions_InsertDate]  DEFAULT (sysdatetime()) FOR [InsertDate]
END
GO

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_UpdateDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD  CONSTRAINT [DF_Transactions_UpdateDate]  DEFAULT (sysdatetime()) FOR [UpdateDate]
END
GO

Temporary table

Same columns as the mgdata table, including the relevant fields. It also has a non-unique clustered index:
(
    [BusinessTransactionID] [varchar](150) NULL,
    [BusinessUserID] [varchar](150) NULL,
    [PostalCode] [varchar](25) NULL,
    [TransactionDate] [datetime2](2) NULL,

    [Units] [int] NOT NULL,
    [StartDate] [datetime2](2) NULL,
    [EndDate] [datetime2](2) NULL,
    [TransactionID] [bigint] NULL,
    [ClientID] [int] NULL
)

CREATE CLUSTERED INDEX ##workingTransactionsMG_idx ON #workingTransactions (TransactionID)

It is populated in batches (500k rows at a time), something like this (pseudocode; the actual column list and range bounds are elided):

IF OBJECT_ID(N'tempdb.dbo.#workingTransactions') IS NOT NULL
    DROP TABLE #workingTransactions;

SELECT /* fields */
INTO #workingTransactions
FROM import.Transactions
WHERE ImportRowID BETWEEN @RangeStart AND @RangeEnd; -- per-batch range

Primary Key

CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED 
(
    [TransactionID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE) ON [Data]

Non-clustered index

CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
    [ClientID] ASC,
    [TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE)

Sample update statement

-- updates every field
UPDATE t
SET t.[TransactionID] = s.[TransactionID],
    t.[CityCode] = s.[CityCode],
    t.[TransactionDate] = s.[TransactionDate],
    t.[ClientID] = s.[ClientID],
    t.[PackageMonths] = s.[PackageMonths],
    t.[UpdateDate] = @UpdateDate
FROM #workingTransactions s
JOIN [DB].[sales].[Transactions] t
    ON s.[TransactionID] = t.[TransactionID]
-- note: the algorithm name must be exactly 'SHA2_256'; the original had a
-- trailing space ('SHA2_256 '), which can make HASHBYTES return NULL
WHERE HASHBYTES('SHA2_256', CONCAT(s.[BusinessTransactionID], '|', s.[BusinessUserID], '|' /* etc */))
   <> HASHBYTES('SHA2_256', CONCAT(t.[BusinessTransactionID], '|', t.[BusinessUserID], '|' /* etc */));

My question is, what is going on under the hood? And what is the solution? For reference, the link above mentions this:

At this point, I have a few theories:

  • A bug related to memory pressure or a large parallel update plan, but I would expect a different type of error, and so far I cannot correlate low resources with the timeframe of these isolated and sporadic errors.
  • A bug in the UPDATE statement or the data is causing an actual duplicate violation on the primary key, but some obscure SQL Server bug results in an error message that cites the wrong index name.
  • Dirty reads resulting from read uncommitted isolation causing a large parallel update to double insert. But the ETL developers claim default read committed is used, and it's hard to determine exactly what isolation level the process actually uses at runtime (see the snippet below).
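
One way to check the isolation level actually in effect at runtime (a diagnostic sketch of mine, not part of the quoted theories) is the sessions DMV:

-- transaction_isolation_level: 0 = Unspecified, 1 = ReadUncommitted,
-- 2 = ReadCommitted, 3 = RepeatableRead, 4 = Serializable, 5 = Snapshot
SELECT session_id, transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID; -- or the session_id of the SSIS connection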

I suspect that if I tweak the execution plan as a work-around, perhaps with a MAXDOP 1 hint or a session trace flag to disable the spool operation, the error will just go away, but it's unclear how this would impact performance.
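
For illustration, such a work-around could be hinted directly on the statement. This is only a sketch: whether it actually suppresses the error is untested, and USE HINT ('NO_PERFORMANCE_SPOOL') requires SQL Server 2016 SP1 or later:

UPDATE t
SET /* ... column assignments as above ... */
    t.[UpdateDate] = @UpdateDate
FROM #workingTransactions s
JOIN [DB].[sales].[Transactions] t
    ON s.[TransactionID] = t.[TransactionID]
OPTION (MAXDOP 1, USE HINT ('NO_PERFORMANCE_SPOOL'));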

Version

Microsoft SQL Server 2017 (RTM-CU13) (KB4466404) – 14.0.3048.4 (X64)
Nov 30 2018 12:57:58
Copyright (C) 2017 Microsoft Corporation
Enterprise Edition (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393: )

Best Answer

My question is, what is going on under the hood? And what is the solution?

It is a bug. The problem is that it only happens occasionally, and will be tough to reproduce. Still, your best chance is to engage Microsoft support. Update processing is mind-bendingly complex, so this will require a very detailed investigation.

For an example of the sort of complexities involved, have a look at my posts MERGE Bug with Filtered Indexes and Incorrect Results with Indexed Views. Neither of those relate directly to your issue, but they do give a flavour.

Write a deterministic update

That's all rather generic of course. Perhaps more usefully, I can say that you should look to rewrite your current UPDATE statement. As the documentation says:

Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.

Your UPDATE is not deterministic, and the results are therefore undefined. You should change it so that at most one source row is identified for each target row. Without that change, the result of the update may not reflect any individual source row.

Example

Let me show you an example, using tables loosely modelled on those given in the question:

CREATE TABLE dbo.Transactions
(
    TransactionID bigint NOT NULL,
    ClientID integer NOT NULL,
    TransactionDate datetime2(2) NOT NULL,

    CONSTRAINT PK_dbo_Transactions
        PRIMARY KEY CLUSTERED (TransactionID),

    INDEX dbo_Transactions_ClientID_TranDate
        (ClientID, TransactionDate)
);

CREATE TABLE #Working
(
    TransactionID bigint NULL,
    ClientID integer NULL,
    TransactionDate datetime2(2) NULL,

    INDEX cx CLUSTERED (TransactionID)
);

To keep things simple, put one row in the target table, and four rows in the source:

INSERT dbo.Transactions 
    (TransactionID, ClientID, TransactionDate)
VALUES 
    (1, 1, '2019-01-01');

INSERT #Working 
    (TransactionID, ClientID, TransactionDate)
VALUES 
    (1, 2, NULL),
    (1, NULL, '2019-03-03'),
    (1, 3, NULL),
    (1, NULL, '2019-02-02');

All four source rows match the target on TransactionID, so which one will be used if we run an update (like the one in the question) that joins on TransactionID alone?

UPDATE T
SET T.TransactionID = W.TransactionID,
    T.ClientID = W.ClientID,
    T.TransactionDate = W.TransactionDate
FROM #Working AS W
JOIN dbo.Transactions AS T
    ON T.TransactionID = W.TransactionID;

(Updating the TransactionID column is not important for the demo; you can comment it out if you like.)

The first surprise is that the UPDATE completes without an error, despite the target table not allowing nulls in any column (all the candidate rows contain a null).

The important point is that the result is undefined, and in this case produces an outcome that matches none of the source rows:

SELECT
    T.TransactionID,
    T.ClientID,
    T.TransactionDate
FROM dbo.Transactions AS T;
╔═══════════════╦══════════╦════════════════════════╗
║ TransactionID ║ ClientID ║    TransactionDate     ║
╠═══════════════╬══════════╬════════════════════════╣
║             1 ║        2 ║ 2019-03-03 00:00:00.00 ║
╚═══════════════╩══════════╩════════════════════════╝

db<>fiddle demo

More details: The ANY Aggregate is Broken

The update should be written such that it would succeed if written as the equivalent MERGE statement, which does check for attempts to update the same target row more than once. I do not generally recommend using MERGE directly, because it has been subject to so many implementation bugs, and normally has worse performance.
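
To see that check in action (my sketch against the demo tables above, not code from the original answer), the MERGE equivalent of the demo update fails instead of silently picking an arbitrary row:

-- Fails with "The MERGE statement attempted to UPDATE or DELETE the same row
-- more than once..." because four source rows match the single target row.
MERGE dbo.Transactions AS T
USING #Working AS W
    ON T.TransactionID = W.TransactionID
WHEN MATCHED THEN UPDATE SET
    T.ClientID = W.ClientID,
    T.TransactionDate = W.TransactionDate;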

As a bonus, you may find that rewriting your current update to be deterministic makes the occasional bug problem go away as well. The product bug will still exist for people who write non-deterministic updates, of course.
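
As a sketch of such a rewrite (my illustration, not code from the original answer), one option is to pick exactly one source row per key with ROW_NUMBER; the ORDER BY tie-break shown is an assumption and should be replaced with a deliberate business rule:

WITH OneRowPerKey AS
(
    SELECT W.TransactionID, W.ClientID, W.TransactionDate,
           rn = ROW_NUMBER() OVER (
                    PARTITION BY W.TransactionID
                    ORDER BY W.TransactionDate DESC) -- assumed tie-break rule
    FROM #Working AS W
)
UPDATE T
SET T.ClientID = S.ClientID,
    T.TransactionDate = S.TransactionDate
FROM OneRowPerKey AS S
JOIN dbo.Transactions AS T
    ON T.TransactionID = S.TransactionID
WHERE S.rn = 1;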