SQL Server 2008 R2 – Gigantic Gaps in IDENTITY Column

identity, sql-server, sql-server-2008-r2

This question is about strange behavior in SQL Server 2008 R2 SP3 Standard Edition x64.

I have read many questions about IDENTITY increments in SQL Server and strange gaps between them. But basically people say it is related to transactions or restarts. In my case (using a named instance) gaps are simply gigantic.

The application inserted 130 000 records into the table, and an "Arithmetic overflow error converting IDENTITY to data type int" error occurred. We truncated the table and bulk inserted 70 000 records. During the bulk insert the maximum value of the IDENTITY column climbed above 9 000 000!

As it was not a normal load but a mass insert, there was no rollback, restart, or any other issue during this operation.

So the gaps are not on the order of 1 000 but tens of thousands.

As I do not control the application code (it is commercial software bought from another company), what can I do as the administrator of such a database? Do I have to truncate the table every 6 months? Monitor the ID column in real time?

Is anyone else experiencing a similar issue?

Best Answer

It is very much normal for there to be gaps in identity values (caused by rolled-back transactions, over-estimated pre-allocations, and so on), as you are already aware. That sort of sparse use of values is unusual, though, and indicates something is going on that you need to understand: either a design issue that may be unavoidable and that you just need to plan around, or a fault that needs fixing.

The first thing I would ask is: are you touching the IDENTITY column at all in the bulk insert? If you are working with IDENTITY_INSERT ON (or similar circumstances) and the imported data contains large values, then that is the source of the problem. Another thing to check for is hidden processes that might touch the table: do you have any triggers that fire as data is added to the table and do something odd?
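To rule those two things out, a quick diagnostic might look like this (a sketch only — `dbo.YourTable` is a placeholder for the real table name):

```sql
-- Report the current identity value without changing it
DBCC CHECKIDENT ('dbo.YourTable', NORESEED);

-- List any triggers defined on the table, with their definitions
SELECT t.name AS trigger_name,
       t.is_disabled,
       OBJECT_DEFINITION(t.object_id) AS trigger_body
FROM sys.triggers AS t
WHERE t.parent_id = OBJECT_ID('dbo.YourTable');

-- Compare the last identity value issued with the actual row count;
-- a huge difference confirms the sparse use of values
SELECT IDENT_CURRENT('dbo.YourTable') AS last_identity,
       COUNT(*) AS row_count
FROM dbo.YourTable;
```

If `last_identity` is far ahead of `row_count` immediately after a clean bulk load, the values are being burned during the insert itself rather than by later activity.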

Otherwise, if you could provide more detail in your question that might help someone help you:

  • The structure of the table in question (and any others touched by the same bulk process): column/key/index definitions and any triggers
  • The code you are using for the bulk insert (T-SQL, a call to BCP, whatever method you are using)
  • A sample of the data (perhaps anonymised if it is sensitive)
  • When you performed your test was the database otherwise quiet or was it active with other transactions potentially being processed?
  • Are the gaps following any sort of regular pattern, or do they seem random?
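On that last point, one way to see the gap pattern is to compare each identity value with its predecessor. SQL Server 2008 R2 predates LAG, so a ROW_NUMBER self-join does the job (again a sketch — `id` and `dbo.YourTable` are placeholder names):

```sql
-- Gap sizes between consecutive identity values; 2008 R2 has no LAG,
-- so number the rows and join each row to the previous one
WITH numbered AS (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY id) AS rn
    FROM dbo.YourTable
)
SELECT prev.id              AS gap_starts_after,
       cur.id               AS next_id,
       cur.id - prev.id - 1 AS gap_size
FROM numbered AS cur
JOIN numbered AS prev
  ON prev.rn = cur.rn - 1
WHERE cur.id - prev.id > 1
ORDER BY gap_size DESC;
```

A few recurring gap sizes (say, always multiples of some batch size) would point at the bulk process itself; random sizes would point elsewhere.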

As a workaround you might consider using BIGINT for the column instead, though if it is referenced by foreign keys elsewhere that could be a big job to change (and if the value isn't purely internal to the database, you need to consider whether the code touching it can cope with values beyond 32-bit limits).
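For the simplest case, the change is a sketch like the following, assuming the IDENTITY column `id` is the primary key with no foreign keys pointing at it (all names are placeholders; if other tables reference the column, their foreign keys must be dropped and recreated too, and note this rewrites every row, so expect downtime on a large table):

```sql
-- The PK constraint must be dropped before the type can change
ALTER TABLE dbo.YourTable DROP CONSTRAINT PK_YourTable;

-- Widen the column; the IDENTITY property is preserved
ALTER TABLE dbo.YourTable ALTER COLUMN id BIGINT NOT NULL;

-- Recreate the primary key on the widened column
ALTER TABLE dbo.YourTable ADD CONSTRAINT PK_YourTable PRIMARY KEY (id);
```

Even at the rate you describe (roughly 9 000 000 values burned per 70 000 rows), a BIGINT gives you headroom for a very long time, which buys you space to chase the real cause without emergency truncations.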