SQL Server 2012 – Why Denali Sequences Perform Better Than Identity Columns

sequencesql serversql-server-2012

In his answer to Which is better: identity columns or generated unique id values? mrdenny says:

When SQL Denali comes out it will support sequences which will be more efficient than identity, but you can't create something more efficient yourself.

I'm not so sure. Knowing Oracle's sequences, I have either to create a trigger for insert, encapsulate each insert into a call of a stored procedure, or pray that I do not forget to properly use the sequence when I do an ad-hoc insert.

I doubt that the advantages of sequences are so obvious.

Best Answer

I'll answer here as well. It has to do with the internals of how IDENTITY and SEQUENCE work.

With IDENTITY, SQL Server pre-caches values into memory so that they are readily available. See Martin Smith's answer for the details. As values are used, a background process generates more values. As you can imagine this pool can run out pretty quickly, leaving the application at the mercy of the background process that is generating the values.

With SEQUENCE, SQL Server allows you to define how large the cache should be. While SQL Server doesn't actually keep the values in the cache, it only keeps the current value and the top end value, this will greatly reduce the amount of IO that is needed to create values.

Don't set the cache too high, as this will reduce the number of numbers which can be used: if SQL Server were to crash, any values specified in the current cache range which weren't used would be lost.

As for row insertion, just specify a default value for the column, like so:

DEFAULT (NEXT VALUE FOR Audit.EventCounter),

Related Solutions

Sql-server – Why do sequential GUID keys perform faster than sequential INT keys in the test case

I modified @Phil Sandler's code to remove the effect of calling GETDATE() (there may be hardware effects/interrupts involved??), and made rows the same length.

[There have been several articles since SQL Server 2000 relating to timing issues and high-resolution timers, so I wanted to minimise that effect.]

In simple recovery model with data and log file both sized way over what is required, here are the timings (in seconds): (Updated with new results based on exact code below)

       Identity(s)  Guid(s)
       ---------    -----
       2.876        4.060    
       2.570        4.116    
       2.513        3.786   
       2.517        4.173    
       2.410        3.610    
       2.566        3.726
       2.376        3.740
       2.333        3.833
       2.416        3.700
       2.413        3.603
       2.910        4.126
       2.403        3.973
       2.423        3.653
    -----------------------
Avg    2.650        3.857
StdDev 0.227        0.204

The code used:

SET NOCOUNT ON

CREATE TABLE TestGuid2 (Id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
SomeDate DATETIME, batchNumber BIGINT, FILLER CHAR(88))

CREATE TABLE TestInt (Id Int NOT NULL identity(1,1) PRIMARY KEY,
SomeDate DATETIME, batchNumber BIGINT, FILLER CHAR(100))

DECLARE @Numrows INT = 1000000

CREATE TABLE #temp (Id int NOT NULL Identity(1,1) PRIMARY KEY, rowNum int, adate datetime)

DECLARE @LocalCounter INT = 0

--put rows into temp table
WHILE (@LocalCounter < @NumRows)
BEGIN
    INSERT INTO #temp(rowNum, adate) VALUES (@LocalCounter, GETDATE())
    SET @LocalCounter += 1
END

--Do inserts using GUIDs
DECLARE @GUIDTimeStart DateTime = GETDATE()
INSERT INTO TestGuid2 (SomeDate, batchNumber) 
SELECT adate, rowNum FROM #temp
DECLARE @GUIDTimeEnd  DateTime = GETDATE()

--Do inserts using IDENTITY
DECLARE @IdTimeStart DateTime = GETDATE()
INSERT INTO TestInt (SomeDate, batchNumber) 
SELECT adate, rowNum FROM #temp
DECLARE @IdTimeEnd DateTime = GETDATE()

SELECT DATEDIFF(ms, @IdTimeStart, @IdTimeEnd) AS IdTime, DATEDIFF(ms, @GUIDTimeStart, @GUIDTimeEnd) AS GuidTime

DROP TABLE TestGuid2
DROP TABLE TestInt
DROP TABLE #temp
GO

After reading @Martin's investigation, I re-ran with the suggested TOP(@num) in both cases, i.e.

...
--Do inserts using GUIDs
DECLARE @num INT = 2147483647; 
DECLARE @GUIDTimeStart DATETIME = GETDATE(); 
INSERT INTO TestGuid2 (SomeDate, batchNumber) 
SELECT TOP(@num) adate, rowNum FROM #temp; 
DECLARE @GUIDTimeEnd DATETIME = GETDATE();

--Do inserts using IDENTITY
DECLARE @IdTimeStart DateTime = GETDATE()
INSERT INTO TestInt (SomeDate, batchNumber) 
SELECT TOP(@num) adate, rowNum FROM #temp;
DECLARE @IdTimeEnd DateTime = GETDATE()
...

and here are the timing results:

       Identity(s)  Guid(s)
       ---------    -----
       2.436        2.656
       2.940        2.716
       2.506        2.633
       2.380        2.643
       2.476        2.656
       2.846        2.670
       2.940        2.913
       2.453        2.653
       2.446        2.616
       2.986        2.683
       2.406        2.640
       2.460        2.650
       2.416        2.720

    -----------------------
Avg    2.426        2.688
StdDev 0.010        0.032

I wasn't able to get the actual execution plan, as the query never returned! It seems a bug is likely. (Running Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64))

Sql-server – Are foreign keys better than indexes

Foreign Key Constraints and Indexes are two completely separate entities within a database. As you have explicitly stated in your question, foreign key relationships are to ensure Referential Integrity. If your database copy is soley for OLAP that wouldn't really come into play (as long as you aren't doing any DML batches against it).

The indexes are what are going to impact your query performance. And if the above is true (no INSERT, UPDATE, or DELETE), you shouldn't be too concerned with the overhead of index maintenance.

Because of this, you can wisely choose indexes in order to maximize performance. But without seeing more details on your data, and what already exists, generalized ideas are all that can be given to you.

EDIT: As per Grant Fritchey's blog post, there can be a performance gain due to foreign key constraints, as the query optimizer can omit certain unnecessary operations due to the referential aspects of the constraint. For more information, please see Grant's blog post: Do Foreign Key Constraints Help Performance?

Best Answer

Related Solutions

Sql-server – Why do sequential GUID keys perform faster than sequential INT keys in the test case

Sql-server – Are foreign keys better than indexes

Related Question