How SQL Server 2012 Defines New Identities

database-internalsidentitysql serversql-server-2012

On Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64), is there a way to see what mechanism SQL Server uses to calculate the new identity values for tables created by SELECT INTO?


SAMPLE DATA

-- Create our base table
CREATE TABLE dbo.A
(A_ID INT IDENTITY(1, 1),
x1 INT,
noise1 int DEFAULT 1,
noise2 char(1) DEFAULT 'S',
noise3 date DEFAULT GETUTCDATE(),
noise4 bit DEFAULT 0);

-- Create random data between the range of [0-3]
INSERT INTO dbo.A(x1)
SELECT s1000.n FROM
( SELECT TOP (10) n = 1 FROM sys.columns) AS s10 -- 10
CROSS JOIN
( SELECT TOP (10) n = 1 FROM sys.columns) AS s100 -- 10 * 10
CROSS JOIN
( SELECT TOP (10) n = ABS(CHECKSUM(NEWID())) % 4 FROM sys.columns) AS s1000; -- 100 * 10

SELECT * FROM dbo.A ORDER BY A_ID DESC;

In my case, the results are:

dbo.A results

And running

DBCC CHECKIDENT('A')

returns

Checking identity information: current identity value '1000', current
column value '1000'.


Create new table

When we select a subset into a new table, it creates a new current identity value:

SELECT * INTO #temp FROM dbo.A WHERE x1 = 0;

The highest value in the identity column of this new table is 998 and when we check it's next identity:

DBCC CHECKIDENT('#temp')

Checking identity information: current identity value '998', current
column value '998'.


How?

How are these identity values inserted into the new table with the highest value correctly set as the current identity value of the new table?

Best Answer

SELECT INTO has two phases (not visible in execution plans).

First, it creates a table that matches the metadata of the query used to create it. This happens in a system transaction, so the (empty) created table will continue to exist even if the SELECT INTO is wrapped in a user transaction that is rolled back. At the end of the first phase, we have an empty table that includes a column named A_ID with the identity property.

The second phase performs an insert into the table the first phase created. This insert is done with identity_insert semantics, so the identity values from the source table end up in the destination, unchanged. The highest value actually inserted is set as the last value used. You can use IDENT_CURRENT or sys.indentity_columns to see it.

Unrelated, but semi-interesting fact: There is a small window between creating the table and starting to perform the insert, where the source table in the query is not protected by a schema stability lock. If a concurrent process happens to change the schema of the source table (after the target has been created, but before the insert starts), error 539 is raised:

Schema changed after the target table was created. Rerun the Select Into query.