We have encountered this strange error three times over the past few days, after being error-free for 8 weeks, and I'm stumped.
This is the error message:
Executing the query "EXEC dbo.MergeTransactions" failed with the following error: "Cannot insert duplicate key row in object 'sales.Transactions' with unique index 'NCI_Transactions_ClientID_TransactionDate'. The duplicate key value is (1001, 2018-12-14 19:16:29.00, 304050920).".
The index we have is not unique. If you notice, the duplicate key value in the error message doesn't even line up with the index (three values for a two-column key). Stranger still, if I rerun the proc, it succeeds.
This is the most recent link I could find describing my issue, but I don't see a solution there.
A couple things about my scenario:
- The proc is updating the TransactionID (part of the primary key). I suspect this is what's causing the error, but I don't know why. We'll be removing that logic.
- Change tracking is enabled on the table
- The transaction runs at read uncommitted isolation
There are 45 fields in each table; I mainly listed the ones used in indexes. I'm updating the TransactionID (clustered key) in the update statement, unnecessarily. It's strange that we haven't had any issues for months until last week, and it's only happening sporadically via SSIS.
Table
USE [DB]
GO
/****** Object: Table [sales].[Transactions] Script Date: 5/29/2019 1:37:49 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND type in (N'U'))
BEGIN
CREATE TABLE [sales].[Transactions]
(
[TransactionID] [bigint] NOT NULL,
[ClientID] [int] NOT NULL,
[TransactionDate] [datetime2](2) NOT NULL,
/* snip*/
[BusinessUserID] [varchar](150) NOT NULL,
[BusinessTransactionID] [varchar](150) NOT NULL,
[InsertDate] [datetime2](2) NOT NULL,
[UpdateDate] [datetime2](2) NOT NULL,
CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED
(
[TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [DB_Data]
) ON [DB_Data]
END
GO
USE [DB]
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND name = N'NCI_Transactions_ClientID_TransactionDate')
begin
CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
[ClientID] ASC,
[TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE) ON [DB_Data]
END
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_Units]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_Units] DEFAULT ((0)) FOR [Units]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_ISOCurrencyCode]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_ISOCurrencyCode] DEFAULT ('USD') FOR [ISOCurrencyCode]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_InsertDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_InsertDate] DEFAULT (sysdatetime()) FOR [InsertDate]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_UpdateDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_UpdateDate] DEFAULT (sysdatetime()) FOR [UpdateDate]
END
GO
Temporary table
The temp table has the same columns as mgdata, including the relevant fields, and also has a non-unique clustered index:
(
[BusinessTransactionID] [varchar](150) NULL,
[BusinessUserID] [varchar](150) NULL,
[PostalCode] [varchar](25) NULL,
[TransactionDate] [datetime2](2) NULL,
[Units] [int] NOT NULL,
[StartDate] [datetime2](2) NULL,
[EndDate] [datetime2](2) NULL,
[TransactionID] [bigint] NULL,
[ClientID] [int] NULL,
)
CREATE CLUSTERED INDEX ##workingTransactionsMG_idx ON #workingTransactions (TransactionID)
It is populated in batches (500k rows at a time), something like this
IF OBJECT_ID(N'tempdb.dbo.#workingTransactions') IS NOT NULL DROP TABLE #workingTransactions;
select fields
into #workingTransactions
from import.Transactions
where importrowid between two number ranges -- pseudocode
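A fuller sketch of how the batching might look as a loop — the @BatchSize variable and range bounds are hypothetical (the real ranges come from the SSIS package):

```sql
-- Hypothetical batching loop; the actual row ranges are computed by SSIS.
DECLARE @BatchSize  bigint = 500000,
        @RangeStart bigint = 1,
        @MaxRowID   bigint = (SELECT MAX(importrowid) FROM import.Transactions);

WHILE @RangeStart <= @MaxRowID
BEGIN
    IF OBJECT_ID(N'tempdb.dbo.#workingTransactions') IS NOT NULL
        DROP TABLE #workingTransactions;

    SELECT /* fields */ *
    INTO #workingTransactions
    FROM import.Transactions
    WHERE importrowid BETWEEN @RangeStart AND @RangeStart + @BatchSize - 1;

    -- EXEC dbo.MergeTransactions;  -- process this batch

    SET @RangeStart += @BatchSize;
END;
```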
Primary Key
CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED
(
[TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [Data]
) ON [Data]
Non-clustered index
CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
[ClientID] ASC,
[TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE)
sample update statement
-- updates every field
update t
set
t.transactionid = s.transactionid,
t.[CityCode]=s.[CityCode],
t.TransactionDate=s.[TransactionDate],
t.[ClientID]=s.[ClientID],
t.[PackageMonths] = s.[PackageMonths],
t.UpdateDate = @UpdateDate
FROM #workingTransactions s
JOIN [DB].[sales].[Transactions] t
ON s.[TransactionID] = t.[TransactionID]
WHERE CAST(HASHBYTES('SHA2_256', CONCAT(S.[BusinessTransactionID], '|', S.[BusinessUserID], '|' /* etc */)) AS varbinary(32))
   <> CAST(HASHBYTES('SHA2_256', CONCAT(T.[BusinessTransactionID], '|', T.[BusinessUserID], '|' /* etc */)) AS varbinary(32))
My question is, what is going on under the hood? And what is the solution? For reference, the link above mentions this:
At this point, I have a few theories:
- A bug related to memory pressure or a large parallel update plan, but I would expect a different type of error, and so far I cannot correlate low resources with the timeframe of these isolated and sporadic errors.
- A bug in the UPDATE statement or data is causing an actual duplicate violation on the primary key, but some obscure SQL Server bug is resulting in an error message that cites the wrong index name.
- Dirty reads resulting from read uncommitted isolation are causing a large parallel update to double insert. But the ETL developers claim the default read committed is used, and it's hard to determine exactly what isolation level the process actually uses at runtime.

I suspect that if I tweak the execution plan as a work-around, perhaps with a MAXDOP (1) hint or a session trace flag to disable the spool operation, the error will just go away, but it's unclear how this would impact performance.
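For reference, the MAXDOP work-around mentioned above is a one-line hint on the statement; shown here on a skeletal version of the update (whether it actually suppresses the error is speculative):

```sql
-- Force a serial plan as a diagnostic work-around (skeletal update).
UPDATE t
SET t.UpdateDate = @UpdateDate  -- ...plus the other columns as in the real proc
FROM #workingTransactions AS s
JOIN [DB].[sales].[Transactions] AS t
    ON s.TransactionID = t.TransactionID
OPTION (MAXDOP 1);
```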
Version
Microsoft SQL Server 2017 (RTM-CU13) (KB4466404) – 14.0.3048.4 (X64)
Nov 30 2018 12:57:58
Copyright (C) 2017 Microsoft Corporation
Enterprise Edition (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393: )
Best Answer
It is a bug. The problem is that it only happens occasionally, and will be tough to reproduce. Still, your best chance is to engage Microsoft support. Update processing is mind-bendingly complex, so this will require a very detailed investigation.
For an example of the sort of complexities involved, have a look at my posts MERGE Bug with Filtered Indexes and Incorrect Results with Indexed Views. Neither of those relates directly to your issue, but they do give a flavour.
Write a deterministic update
That's all rather generic, of course. Perhaps more usefully, I can say that you should look to rewrite your current UPDATE statement. As the documentation says:

"Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic."

Your UPDATE is not deterministic, and the results are therefore undefined. You should change it so that at most one source row is identified for each target row. Without that change, the result of the update may not reflect any individual source row.
Example
Let me show you an example, using tables loosely modelled on those given in the question.

To keep things simple, put one row in the target table, and four rows in the source.

All four source rows match the target on TransactionID, so which one will be used if we run an update (like the one in the question) that joins on TransactionID alone?

(Updating the TransactionID column is not important for the demo; you can comment it out if you like.)

The first surprise is that the UPDATE completes without an error, despite the target table not allowing nulls in any column (all the candidate rows contain a null).

The important point is that the result is undefined, and in this case produces an outcome that matches none of the source rows:
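The demo script itself did not survive this copy of the answer; a minimal reconstruction of the same idea (table and column names are illustrative, not the original demo code) might look like this:

```sql
-- Illustrative reconstruction, not the original demo script.
-- Target table: one row, no NULLs allowed in any column.
CREATE TABLE dbo.DemoTarget
(
    TransactionID   bigint       NOT NULL,
    ClientID        int          NOT NULL,
    TransactionDate datetime2(2) NOT NULL
);
INSERT dbo.DemoTarget
VALUES (1, 100, '2019-01-01');

-- Source table: four rows, all matching on TransactionID = 1,
-- and every row contains a NULL in one of the updated columns.
CREATE TABLE dbo.DemoSource
(
    TransactionID   bigint       NULL,
    ClientID        int          NULL,
    TransactionDate datetime2(2) NULL
);
INSERT dbo.DemoSource
VALUES (1, NULL, '2019-02-02'),
       (1, 200,  NULL),
       (1, NULL, '2019-03-03'),
       (1, 300,  NULL);

-- Non-deterministic update: four source rows qualify for the one target row,
-- so which values land in the target is undefined.
UPDATE t
SET t.ClientID        = s.ClientID,
    t.TransactionDate = s.TransactionDate
FROM dbo.DemoSource AS s
JOIN dbo.DemoTarget AS t
    ON s.TransactionID = t.TransactionID;
```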
db<>fiddle demo
More details: The ANY Aggregate is Broken
The update should be written such that it would succeed if written as the equivalent MERGE statement, which does check for attempts to update the same target row more than once. I do not generally recommend using MERGE directly, because it has been subject to so many implementation bugs, and normally has worse performance.

As a bonus, you may find that rewriting your current update to be deterministic makes your occasional bug problem go away as well. The product bug will still exist for people who write non-deterministic updates, of course.
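One way to make the update deterministic is to pick exactly one source row per TransactionID before joining. This is a sketch only: the ORDER BY column used as a tie-breaker here (importrowid) is an assumption; substitute whatever rule defines the "right" source row in your data.

```sql
-- Deduplicate the source so each target row has at most one matching source row.
WITH DedupedSource AS
(
    SELECT s.*,
           ROW_NUMBER() OVER (
               PARTITION BY s.TransactionID
               ORDER BY s.importrowid DESC) AS rn   -- hypothetical tie-breaker
    FROM #workingTransactions AS s
)
UPDATE t
SET t.CityCode        = s.CityCode,
    t.TransactionDate = s.TransactionDate,
    t.ClientID        = s.ClientID,
    t.UpdateDate      = @UpdateDate
FROM DedupedSource AS s
JOIN [DB].[sales].[Transactions] AS t
    ON s.TransactionID = t.TransactionID
WHERE s.rn = 1;
```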