SQL Server Identity Property – Why Removing Identity Property on a Column is Not Supported

identitysql server

I have read that after SQL Server 2000, the ability to "un-identity" an identity column was removed. And that this was "By Design" (not just a missing feature).

Here is an example I found on a blog. It involves updating the system tables. (And that ability was removed after SQL Server 2000.) I get that doing this via system tables is not a good idea. I am just wondering why a feature to do this another way is not around.

Working around this is going to cause me a considerable amount of work. (Copying many hundreds of millions of rows to new tables in a downtime intolerant environment.)

So I thought I would ask "Why".

What changed in Sql Server 2005 and later versions that made this a bad thing? Or was it always bad, and just not locked down?

What "Best Practice" (or similar principle) would be violated by making an identity column a normal column again?

—

Update to answer the request for "why I am doing this":
This is a very high level summary: I am going to start adding partitions to my tables. (So that I can archive/purge old data.) That is all easy. But I occasionally need to move a record to a different partition so it does not get removed (when a partition comes up for archival/deletion). (I am having my partitioning column increase by 2 so that there is always space to move the row to a different partition.)

But if the partitioning column is an identity column, then I have to delete and re-insert the value (there is no way to update the value of an identity column). Which causes issues with replication.

So I am wanting to use a sequence instead of an identity column. But that switch is very very hard on large databases.

Best Answer

Your question is, essentially:

Why can I no longer do this risky thing that I should never have been allowed to do in the first place?

The answer to that question is largely irrelevant (though you can see some Microsoft comments in these Connect items asking for this functionality: #294193 and #252226). For completeness, my synopsis is: The ability to remove the identity property was an unintended side effect of having the ability to mess with the system tables in the first place. This was not intended to be used in the many ways that it was, often with very bad consequences, and thus it was removed. It was an undocumented, unsupported, system table hack. The ability to change data in system tables wasn't removed because Microsoft no longer wanted you to hack your way out of a column being an identity column, it was removed because mucking with the system tables is extremely risky. Removing the IDENTITY property itself wasn't a specifically targeted feature removal, and I would have never fully trusted this approach even back in the ancient days when it was possible.

That said, how about we answer this question instead?

How do I remove the IDENTITY property of a column with minimal or no downtime?

This you can do easily, using ALTER TABLE ... SWITCH, a technique I am certain I first learned from our own Paul White in the workarounds for Connect #252226. Quick example, given this simple table:

CREATE TABLE dbo.Original
(
  ID INT IDENTITY(1,1) PRIMARY KEY,
  name SYSNAME
);
GO

INSERT dbo.Original(name) VALUES(N'foo'),(N'bar');
GO

SELECT * FROM dbo.Original;
GO

Results:

ID  name
--  ----
1   foo
2   bar

Now, let's create a shadow table, and switch to it, then drop the old table, rename the new one, and then resume normal activity:

CREATE TABLE dbo.New
(
  ID INT PRIMARY KEY,
  name SYSNAME
);
GO

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
  ALTER TABLE dbo.Original SWITCH TO dbo.New;
  DROP TABLE dbo.Original;
  EXEC sys.sp_rename N'dbo.New', N'Original', 'OBJECT';
COMMIT TRANSACTION;
GO

INSERT dbo.Original(ID,name) VALUES(3,N'splunge');
UPDATE dbo.Original SET ID = 6 WHERE ID = 1;
GO

SELECT * FROM dbo.Original;
GO

Results:

ID  name
--  -------
2   bar
3   splunge
6   foo

Now clean up:

DROP TABLE dbo.Original;

This is a metadata operation only, with no data movement, and will only block other users while the metadata is being updated. But, admittedly, it is a very simplistic example. If you have foreign keys or are using other features like replication, Change Data Capture, Change Tracking, etc. you may need to disable or remove some of those before making this change (I haven't tested all combinations). For foreign keys specifically, see this tip which shows how to generate scripts to drop and re-create all (or selected) foreign key constraints.

Additionally, you will need to update your application code to not expect SQL Server to populate this column, and check any insert or select statements that may be depending on the order of columns or the columns they need to specify. Generally, I would grep your entire code base for any mention of this table.

Also see this script from Itzik Ben-Gan (source: this ancient article) for another way to handle this, but there is data movement involved here, so it doesn't deliver on the "no or minimal downtime" requirement.

Related Solutions

Sql-server – Dropping then adding a table to a subscription with Identity option Not for Replication causes errors

I have found a solution though adding:

@identityrangemanagementoption = N'manual'

when I re-add the article using sp_addarticle, means that the automatically created sps used to replicate the data don't try and update the identity column and thus are created succesfully on the subscriber rather than raising the error.

Why do people recommend not using the name “Id” for an identity column

The table name prefix has very good reasons.

Consider:

TableA (id int identity, stringdata varchar(max))

TableB (id int identity, stringdata varchar(max))

We want to DELETE from TableA records that exist in both tables. Easy enough, we will just do an INNER JOIN:

DELETE a
FROM 
  TableA A
INNER JOIN 
  TableB B
    ON b.id = B.id

....and we just wiped out all of TableA. We inadvertently compared B's ID to itself - every record matched, and every record got deleted.

If the fields had been named TableAId and TableBId this would be impossible (Invalid field name TableAid in TableB).

Personally I have no issue with using the name id in a table, but it's really a better practice to preface it with the table name (or entity name, if TableA were people then PeopleId would work fine too) to avoid accidentally comparing to the wrong field and blowing something up.

This also makes it very obvious where fields come from in long queries with lots of JOINs.

Best Answer

Related Solutions

Sql-server – Dropping then adding a table to a subscription with Identity option Not for Replication causes errors

Why do people recommend not using the name “Id” for an identity column

Related Question