Sql-server – duplicate key error creating unique index on temp, SQL Server 12.0.5589 bug

sql server

I have a job that runs a stored procedure that intermittently fails with a duplicate key error when creating a unique clustered index on a temp table. How is it possible to get this error from the code below?

It seems to happen when the source table is being updated. Transaction Isolation Level is set to read uncommitted, and there ARE some triggers on "SomeTableWithPrimaryKey". It doesn't happen every time, just three times in the last 100 runs. There are about 3 million rows in the source table "SomeTableWithPrimaryKey", and about 500k of them are being selected after the filters are applied.

Is this a SQL Server Bug? Some race condition?

This statement is based on a true query, the names have been changed to protect the innocent…and my job. 🙂

--on SQL Server 12.0.5589
SELECT  p.PrimaryKeyColumn, p.SomeOtherColumn
INTO    #Temp
FROM    dbo.SomeTableWithPrimaryKey p
WHERE   p.SomeCondition = 1
        AND (p.SomeCondition2 = 2   
            OR EXISTS(
                SELECT 1
                FROM    dbo.SomeTable st
                        INNER JOIN dbo.SomeOtherTable sot
                            ON st.Column1 = sot.Column1
                WHERE   st.SomeCondition1 = 1
                        AND st.PrimaryKeyColumn = p.PrimaryKeyColumn
                )
            )
;

--productid index
CREATE UNIQUE CLUSTERED INDEX CX_Temp ON #Temp(PrimaryKeyColumn)
;

^this is where we get the error:

The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.#Temp______…000000D5A436' and the index name 'CX_Temp'. The duplicate key value is (123456789).
[SQLSTATE 23000] (Error 1505) The statement has been terminated.
[SQLSTATE 01000] (Error 3621). The step failed.

Best Answer

Under the READ UNCOMMITTED isolation level, SQL Server can read the same row twice while the source table is being modified, for example when a row moves during the scan. The query then returns duplicate rows even though a unique constraint guarantees that the table itself contains no duplicates.
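One way to confirm that this is what happened is to look for repeated keys in the temp table before building the index. The following is a minimal diagnostic sketch that reuses the #Temp table and PrimaryKeyColumn names from the question; the CopiesRead alias is made up for illustration.

```sql
-- Diagnostic sketch: run after the SELECT ... INTO #Temp
-- and before the CREATE UNIQUE CLUSTERED INDEX statement.
-- Any row returned is a key that the scan read more than once.
SELECT   t.PrimaryKeyColumn,
         COUNT(*) AS CopiesRead   -- hypothetical alias
FROM     #Temp t
GROUP BY t.PrimaryKeyColumn
HAVING   COUNT(*) > 1
;
```

If this returns rows while the same check against dbo.SomeTableWithPrimaryKey returns none, the duplicates came from the read, not from the source data.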

Looking at some of the comments on the question:

"Thanks, the general consensus seems to be the iso level, but keep in mind, I've probably used that same basic query above in hundreds of places, and have never seen this behavior before."

This won't happen all the time, and can depend a lot on whether the source table is being updated, the physical operations SQL Server chooses to use to read the data, and the nature of the query being run.

"I will change the iso level and watch it (for possibly a year) to see if we get the same error in the future."

That's great, changing the isolation level is your best option for preventing this problem!
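As a rough sketch of that change, the procedure could set the isolation level explicitly before populating the temp table. This assumes the procedure currently sets READ UNCOMMITTED at the session level and does not use NOLOCK hints on individual tables; if it does use hints, those would need to be removed as well.

```sql
-- Sketch only: replace the READ UNCOMMITTED setting in the procedure.
-- READ COMMITTED is SQL Server's default isolation level.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- The SELECT ... INTO #Temp and CREATE UNIQUE CLUSTERED INDEX
-- statements from the question follow here unchanged.
```

Reads under READ COMMITTED can block behind writers on a busy table, so it is worth testing the job's run time after the change.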

When you have time, there is some great information out there on the issues around using the READ UNCOMMITTED isolation level:

Paul White's post (in a series about isolation levels): The Read Uncommitted Isolation Level
A blog post from Brent Ozar with a very clear demo of how results can change under READ UNCOMMITTED: Using NOLOCK? Here’s How You’ll Get the Wrong Query Results.

Related Solutions

Sql-server – Unique index corrupted SQL. Select query returns single row but create unique index fails

Consider the error: Msg 1505, Level 16, State 1, Line 2. The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.MSmerge_contents' and the index name 'uc1SycContents'. The duplicate key value is (7696031, 08703987-557d-e111-9888-e61f13c44f03)... I am running the query below
select * from msmerge_contents where rowguid='08703987-557d-e111-9888-e61f13c44f03'
and it returns only 1 row.

When you have index corruption problems (i.e. keys present in an NC index but not in the base table, or vice versa) you must be very careful about the SQL you use to validate data. At this moment your data is inconsistent, but the query optimizer does not know that and completely trusts your schema, including these incorrect indexes. As such, it may optimize your query to use one of the NC indexes that is missing a key, and the result will also miss a key, falsely reporting no duplicates. To solve this catch-22 you need to force the optimizer's hand by explicitly requesting one index or another, and make sure the projected list of columns can be satisfied by the index you forced (i.e. no *). Assuming uc1SycContents is not the clustered index, try out the following:

-- read through the clustered index (index id 1)
select rowguid
from msmerge_contents with (INDEX(1))
where rowguid='08703987-557d-e111-9888-e61f13c44f03';

-- read through the nonclustered index named in the error
select rowguid
from msmerge_contents with (INDEX(uc1SycContents))
where rowguid='08703987-557d-e111-9888-e61f13c44f03';

This forces the lookup through the base table's clustered index (index id 1) and then through the index uc1SycContents, so you can compare what each structure holds for that rowguid. I expect the first query to return 2 (or more) rows while the second returns 1.
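If the two queries do disagree, a consistency check on the table can report the mismatch directly. This is a minimal sketch using DBCC CHECKTABLE against the table named in the error message; it only reports problems and repairs nothing, and it can be expensive on a large table.

```sql
-- Sketch: report (not repair) inconsistencies between the base table
-- and its indexes. NO_INFOMSGS suppresses informational output;
-- ALL_ERRORMSGS lists every error found for the object.
DBCC CHECKTABLE ('dbo.MSmerge_contents') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```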

Sql-server – How does SQL Server choose an index key for a foreign key reference

The (lack of) documentation suggests that this behaviour is an implementation detail, and is therefore undefined and subject to change at any time.

This is in stark contrast to CREATE FULLTEXT INDEX, where you have to specify the name of an index to attach to -- AFAIK, there is no undocumented FOREIGN KEY syntax to do the equivalent (though theoretically, there could be in the future).

As mentioned, it does make sense that SQL Server chooses the smallest physical index with which to associate the foreign key. If you change the script to create the unique constraint as CLUSTERED, the script "works" on 2008 R2. But that behaviour is still undefined and should not be relied upon.
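Because the choice is undocumented, it may be more useful to inspect which index a foreign key ended up bound to than to predict it. The sketch below reads that from the catalog views; dbo.ParentTable is a hypothetical name standing in for the referenced table, and the column aliases are made up for illustration.

```sql
-- Sketch: list each foreign key that references a given table,
-- together with the index SQL Server bound it to.
SELECT  fk.name                              AS ForeignKeyName,
        OBJECT_NAME(fk.parent_object_id)     AS ReferencingTable,
        OBJECT_NAME(fk.referenced_object_id) AS ReferencedTable,
        i.name                               AS BoundIndexName,
        i.type_desc                          AS BoundIndexType
FROM    sys.foreign_keys fk
        INNER JOIN sys.indexes i
            ON  i.object_id = fk.referenced_object_id
            AND i.index_id  = fk.key_index_id
WHERE   fk.referenced_object_id = OBJECT_ID(N'dbo.ParentTable')  -- hypothetical table name
;
```

Running this before and after changing the unique constraint to CLUSTERED shows whether the binding actually moved.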

As with most legacy applications, you'll just have to get down to the nitty-gritty and clean things up.
