Sql-server – Why are timestamps not always increasing with concurrent inserts

concurrency, sql server, sql-server-2008-r2, timestamp

I'm seeing some unexpected behavior with timestamp (rowversion) columns. I created a test table:

create table Test
(
    Test_Key int identity(1,1) primary key clustered,
    Test_Value int,
    Test_Thread int,
    ts timestamp
)

create nonclustered index IX_Test_Value on Test (Test_Value) -- probably irrelevant

I started two threads running inserts into this table at the same time. The first thread is running the following code:

declare @i int = 0
while @i < 100
begin
    insert into Test (Test_Value, Test_Thread) select n, 1 from dbo.fn_GenerateNumbers(10000)
    set @i = @i + 1
end

The second thread is running identical code, except that it is doing select n, 2 from the function to insert its thread ID.

First, a word about the function. This uses a series of cross-joined common table expressions with a ROW_NUMBER() to return a lot of numbers in sequence very quickly. I learned this trick from an article by Itzik Ben-Gan, so credit goes to him for it. I don't think the implementation of the function matters, but I will include it anyway:

CREATE FUNCTION dbo.fn_GenerateNumbers(@count int)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
    WITH
        Nbrs_4( n ) AS ( SELECT 1 UNION SELECT 0 ),
        Nbrs_3( n ) AS ( SELECT 1 FROM Nbrs_4 n1 CROSS JOIN Nbrs_4 n2 ),
        Nbrs_2( n ) AS ( SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2 ),
        Nbrs_1( n ) AS ( SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2 ),
        Nbrs_0( n ) AS ( SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2 ),
        Nbrs  ( n ) AS ( SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2 )

    SELECT n
    FROM ( SELECT ROW_NUMBER() OVER (ORDER BY n) FROM Nbrs ) D ( n )
    WHERE n <= @count ;

This table has an identity column on it. I expected that when I selected the values from the table by this monotonically increasing primary key, I would see the timestamps in the same order, too. The timestamps might not be sequential, because there might have been other updates, but they would at least be in order.

However, what I am seeing is different. The inserts are interleaving by primary key, but the timestamps are sequential by thread.

Test_Key Test_Value Test_Thread ts
-------- ---------- ----------- ------------------
20227    227        1           0x000000006EDF3BC5
20228    228        1           0x000000006EDF3BC6
20229    229        1           0x000000006EDF3BC7
20230    230        1           0x000000006EDF3BC8
20231    1          2           0x000000006EDF41E9 -- thread 2 starts with a new ts
20232    2          2           0x000000006EDF41EB
20233    3          2           0x000000006EDF41EC
20234    4          2           0x000000006EDF41ED
--<snip lots of thread 2 inserts>
21538    1308       2           0x000000006EDF4710
21539    1309       2           0x000000006EDF4711
21540    1310       2           0x000000006EDF4712
21541    1311       2           0x000000006EDF4713
21542    231        1           0x000000006EDF3BC9 -- This is less than the prior row!
21543    232        1           0x000000006EDF3BCA -- Thread 1 is inserting
21544    233        1           0x000000006EDF3BCB -- from its last ts value
21545    234        1           0x000000006EDF3BCC

My question is:

1) Why is the timestamp not always increasing with concurrent inserts?

Bonus points if you can answer this question:

2) Why are the concurrent inserts overlapping the primary key instead of all being inserted at once? Each insert is running its own implicit transaction, so I expected the primary keys to be in order for a single thread's insert. I did not expect the primary keys to be interleaved.

I don't know enough about replication to answer this one:

3) Does having timestamps out of order cause a problem with replication? In the above example, what if thread 2 commits its data first? When thread 1 completes, its timestamps are all lower than the records inserted by thread 2.

I peeked at the executing requests and verified they are not running parallel, so I don't think parallelism is the problem.

Note that this query was running in the default (READ COMMITTED) isolation level. If I increase the isolation level to SERIALIZABLE, I still get timestamps in reverse order when threads change.

I am testing this on SQL Server 2008 R2.

To check the timestamp orders, I was doing a select * from Test, and I was also using the following queries:

-- find timestamps out of sequential order
select t1.*, t2.*
from Test t1
    inner join Test t2
        on t2.Test_Key = t1.Test_Key + 1
where
    t2.ts <> t1.ts + 1

-- find timestamps that are less than the prior timestamp
select t1.*, t2.*
from Test t1
    inner join Test t2
        on t2.Test_Key = t1.Test_Key + 1
where
    t2.ts < t1.ts

Best Answer

The IDENTITY generator is not well documented. There are some behaviors however that can be observed that seem relevant:

The identity generation does not get affected by transactions. That means once a value has been used it will not be reused, even if the transaction causing its use is rolled back.
Not every use causes an update of the sequence position being written back to the database. You can see that for example after a crash. Often the next used value after a crash is several numbers higher than the previous.
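The first of these behaviors is easy to observe. A minimal sketch, assuming a scratch table named dbo.IdentityDemo (a hypothetical name, not part of the question's schema):

```sql
-- Hypothetical scratch table, used only for this demonstration.
CREATE TABLE dbo.IdentityDemo
(
    Id int IDENTITY(1,1) PRIMARY KEY,
    Payload int
);

BEGIN TRANSACTION;
INSERT INTO dbo.IdentityDemo (Payload) VALUES (1);  -- consumes identity value 1
ROLLBACK TRANSACTION;                               -- the row is undone, the identity value is not handed back

INSERT INTO dbo.IdentityDemo (Payload) VALUES (2);  -- receives the next identity value

SELECT Id, Payload FROM dbo.IdentityDemo;           -- a single row, with Id = 2 rather than 1

DROP TABLE dbo.IdentityDemo;
```

The gap left by the rolled-back insert shows that identity allocation happens outside the transaction that requested it.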

While there is no proof (meaning documentation), it can be assumed that for performance reasons a multi-row insert grabs a block of identity values and uses them until it runs out. Another concurrent thread will get the next block of numbers. At this point the identity value does not actually reflect the order of inserts anymore.

The rowversion data type on the other hand is an ever increasing number that would reflect insert order. (timestamp is a deprecated synonym for rowversion.)
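To see the rowversion counter move, something like the following can be run. This is only a sketch; dbo.RowversionDemo and its columns are hypothetical names chosen for the example:

```sql
-- Hypothetical scratch table, used only for this demonstration.
CREATE TABLE dbo.RowversionDemo
(
    Id int PRIMARY KEY,
    Payload int,
    rv rowversion
);

INSERT INTO dbo.RowversionDemo (Id, Payload) VALUES (1, 10);
SELECT Id, rv AS rv_after_insert FROM dbo.RowversionDemo;

-- Any update to the row assigns it a new, higher rowversion value.
UPDATE dbo.RowversionDemo SET Payload = 11 WHERE Id = 1;

-- @@DBTS returns the last rowversion value used in the current database.
SELECT Id, rv AS rv_after_update, @@DBTS AS last_used_in_database
FROM dbo.RowversionDemo;

DROP TABLE dbo.RowversionDemo;
```

The counter is database-wide, so activity on other tables with a rowversion column also advances it.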

So in your case you can assume that the rows were inserted in the order of the rowversion column and that the out-of-order identity values are caused by in-memory performance optimizations.
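One way to look at the test data from this angle is to walk the rows in rowversion order and collapse them into runs per thread. The query below is a sketch against the Test table from the question; the CTE name and the rn_all / rn_thread aliases are my own. It uses only ROW_NUMBER(), so it should work on 2008 R2:

```sql
WITH Ordered AS
(
    SELECT
        Test_Key,
        Test_Thread,
        CAST(ts AS bigint) AS ts_num,  -- rowversion as a number, for easier comparison
        ROW_NUMBER() OVER (ORDER BY ts) AS rn_all,
        ROW_NUMBER() OVER (PARTITION BY Test_Thread ORDER BY ts) AS rn_thread
    FROM Test
)
SELECT
    Test_Thread,
    MIN(Test_Key) AS first_key,
    MAX(Test_Key) AS last_key,
    MIN(ts_num)   AS first_ts,
    MAX(ts_num)   AS last_ts,
    COUNT(*)      AS rows_in_run
FROM Ordered
-- Rows that are consecutive in ts order and belong to the same thread
-- share the same (rn_all - rn_thread) value, so each group is one run.
GROUP BY Test_Thread, rn_all - rn_thread
ORDER BY MIN(ts_num);
```

Each output row is one uninterrupted stretch of rowversion values from a single thread, together with the range of identity values that stretch received.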

By the way, while the IDENTITY generator is not very well documented, the new 2012 SEQUENCE functionality is. Here you can read all about the behaviors described above in sequences.

As for your concern with replication:

Transactional replication is using the database log and does not rely on specific column values.
Merge replication uses a rowguid column to identify a row. This is a column that gets valued once and does not change throughout the life of the row. Merge replication does not use a rowversion column. Transactional consistency is enforced by the fact that at the time of a synchronization, normal locking is used, so a transaction is either completely visible to the merge agent or completely invisible.
Snapshot replication does not look for changes at all. It simply copies whatever data is committed at the time of the synchronization.

Related Solutions

Sql-server – Why do sequential GUID keys perform faster than sequential INT keys in the test case

I modified @Phil Sandler's code to remove the effect of calling GETDATE() (there may be hardware effects/interrupts involved??), and made rows the same length.

[There have been several articles since SQL Server 2000 relating to timing issues and high-resolution timers, so I wanted to minimise that effect.]

In simple recovery model with data and log file both sized way over what is required, here are the timings (in seconds): (Updated with new results based on exact code below)

       Identity(s)  Guid(s)
       -----------  -------
       2.876        4.060
       2.570        4.116
       2.513        3.786
       2.517        4.173
       2.410        3.610
       2.566        3.726
       2.376        3.740
       2.333        3.833
       2.416        3.700
       2.413        3.603
       2.910        4.126
       2.403        3.973
       2.423        3.653
       -----------  -------
Avg    2.650        3.857
StdDev 0.227        0.204

The code used:

SET NOCOUNT ON

CREATE TABLE TestGuid2 (Id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
SomeDate DATETIME, batchNumber BIGINT, FILLER CHAR(88))

CREATE TABLE TestInt (Id Int NOT NULL identity(1,1) PRIMARY KEY,
SomeDate DATETIME, batchNumber BIGINT, FILLER CHAR(100))

DECLARE @NumRows INT = 1000000

CREATE TABLE #temp (Id int NOT NULL Identity(1,1) PRIMARY KEY, rowNum int, adate datetime)

DECLARE @LocalCounter INT = 0

--put rows into temp table
WHILE (@LocalCounter < @NumRows)
BEGIN
    INSERT INTO #temp(rowNum, adate) VALUES (@LocalCounter, GETDATE())
    SET @LocalCounter += 1
END

--Do inserts using GUIDs
DECLARE @GUIDTimeStart DateTime = GETDATE()
INSERT INTO TestGuid2 (SomeDate, batchNumber) 
SELECT adate, rowNum FROM #temp
DECLARE @GUIDTimeEnd  DateTime = GETDATE()

--Do inserts using IDENTITY
DECLARE @IdTimeStart DateTime = GETDATE()
INSERT INTO TestInt (SomeDate, batchNumber) 
SELECT adate, rowNum FROM #temp
DECLARE @IdTimeEnd DateTime = GETDATE()

SELECT DATEDIFF(ms, @IdTimeStart, @IdTimeEnd) AS IdTime, DATEDIFF(ms, @GUIDTimeStart, @GUIDTimeEnd) AS GuidTime

DROP TABLE TestGuid2
DROP TABLE TestInt
DROP TABLE #temp
GO

After reading @Martin's investigation, I re-ran with the suggested TOP(@num) in both cases, i.e.

...
--Do inserts using GUIDs
DECLARE @num INT = 2147483647; 
DECLARE @GUIDTimeStart DATETIME = GETDATE(); 
INSERT INTO TestGuid2 (SomeDate, batchNumber) 
SELECT TOP(@num) adate, rowNum FROM #temp; 
DECLARE @GUIDTimeEnd DATETIME = GETDATE();

--Do inserts using IDENTITY
DECLARE @IdTimeStart DateTime = GETDATE()
INSERT INTO TestInt (SomeDate, batchNumber) 
SELECT TOP(@num) adate, rowNum FROM #temp;
DECLARE @IdTimeEnd DateTime = GETDATE()
...

and here are the timing results:

       Identity(s)  Guid(s)
       -----------  -------
       2.436        2.656
       2.940        2.716
       2.506        2.633
       2.380        2.643
       2.476        2.656
       2.846        2.670
       2.940        2.913
       2.453        2.653
       2.446        2.616
       2.986        2.683
       2.406        2.640
       2.460        2.650
       2.416        2.720
       -----------  -------
Avg    2.426        2.688
StdDev 0.010        0.032

I wasn't able to get the actual execution plan, as the query never returned! It seems a bug is likely. (Running Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64))

Mysql – query records to fetch by time stamp

Since the seconds are from 0 (1970-01-01 00:00:00 UTC), you should look for every multiple of 60

SELECT * FROM mytable WHERE MOD(TimeStamp,60)=0;

or if TimeStamp is indexed, you can do

SELECT T.* FROM
(SELECT TimeStamp FROM mytable WHERE MOD(TimeStamp,60)=0) M
INNER JOIN mytable T USING (TimeStamp);

Give it a Try !!!
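The arithmetic behind the MOD test can be checked with literal values before running it against the table. A small sketch in MySQL; the value 1700000000 is just an arbitrary epoch second picked for illustration:

```sql
SELECT
    MOD(1700000000, 60)               AS seconds_past_minute,  -- 20
    1700000000 - MOD(1700000000, 60)  AS minute_start,         -- 1699999980
    MOD(1699999980, 60) = 0           AS is_minute_boundary;   -- 1 (true)
```

Subtracting the remainder rounds an epoch value down to the start of its minute, and only those rounded values pass the MOD(TimeStamp,60)=0 filter.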

SUGGESTION #1

You should store the timestamp of the minute and index it

ALTER TABLE mytable ADD COLUMN MinuteTimeStamp INT UNSIGNED AFTER TimeStamp; -- column type assumed; match it to TimeStamp
UPDATE mytable SET MinuteTimeStamp = TimeStamp - MOD(TimeStamp,60);
ALTER TABLE mytable ADD INDEX MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey);

Then, you can do MIN aggregation on MinuteTimeStamp.

SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp;

and use it to get those records

SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);

It was tactfully pointed out that triggers would degrade performance.

Perhaps doing INSERTs like this may help:

INSERT INTO mytable (UniqueKey,TimeStamp,MinuteTimeStamp) VALUES
(
    uniquevalue,
    UNIX_TIMESTAMP(NOW()),
    UNIX_TIMESTAMP(NOW() - INTERVAL SECOND(NOW()) SECOND)
);

SUGGESTION #2

Since you have over 1000 columns (Ugh), perhaps a table of those minute timestamps would be better.

CREATE TABLE MinuteKeys
(
    MinuteTimeStamp INT UNSIGNED NOT NULL,
    UniqueKey INT UNSIGNED NOT NULL,
    PRIMARY KEY (UniqueKey),
    KEY MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey)
) ENGINE=MyISAM;
ALTER TABLE MinuteKeys DISABLE KEYS;
INSERT INTO MinuteKeys SELECT TimeStamp - MOD(TimeStamp,60),UniqueKey FROM mytable;
ALTER TABLE MinuteKeys ENABLE KEYS;

Then, you could use that table for the aggregation

SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM MinuteKeys GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);

EPILOGUE

Other suggestions are possible, but you should really consider normalizing the table.

See my post Too many columns in MySQL as to why.