Sql-server – alter primary id field to uniqueidentifier GUID as default in sql server

alter-tableprimary-keysql serversql-server-2008-r2uniqueidentifier

I have tables with id(primary key – int) with identity specification (1,1) .
I want to change this column to uniqueidentifier with default ( newid() ) .

What I've tried is

ALTER TABLE myTable ALTER COLUMN id uniqueidentifier default NEWID();  

but I got this message

Incorrect syntax near the keyword 'default'.

Additionally ,every tables of my database has id primary key column and I want to change them to uniqueidentifier with default newID() , by looping or something .

Best Answer

every tables of my database has id primary key column and I want to change them to nvarchar(36) with default newID()

No. You do not want to make that change as no good can come of it.

You currently have a PK (Clustered, I assume) that is:

  1. Compact: 4 bytes per row
  2. Efficient: comparisons are simple 4-byte-value-to-4-byte-value (i.e. a binary comparison)
  3. Minimally Fragmenting: new values are sequential and added to the end of the table.

You are wanting to change this to a Clustered PK that is:

  1. Wide: 72 bytes per row (GUID is 16 bytes, which is 32 characters in string form for the hex values plus 4 dashes for 36 characters total, which is 72 bytes in UTF-16 / NVARCHAR).
  2. Inefficient: If using UNIQUEIDENTIFIER it is not so bad as that is still a binary comparisons, just like with INT, but is 16 bytes unlike INT which is 4. Storing this as a string is now a 36 character comparison (36 bytes in VARCHAR and 72 bytes in NVARCHAR) which is slower than the 16 byte UNIQUEIDENTIFIER. Finally, most people storing GUIDs as strings forget to use a binary Collation (e.g. Latin1_General_100_BIN2) to at least make the comparisons byte-by-byte since linguistic rules aren't necessary. Using a case-insensitive Collation, or even a case-sensitive one, is definitely slower as it will apply locale-based linguistic rules.
  3. Highly Fragmenting: new values are all over the place, leading to greatly increased page splits. And if you decrease FILLFACTOR to reduce the number of page splits, you are also reducing the performance of the index as it is spread out over a larger amount of pages.

Please keep in mind the down-stream negative effects of this change due to:

  1. Clustered index keys are copied into non-clustered indexes. Assuming, again, that this PK is Clustered, then each non-clustered index on this table will have that 72 byte value copied into it. Three non-clustered indexes on this table is 72 bytes * 3 = 216 bytes plus the original 72 of the Clustered index = 288 bytes total. On the other hand, the current INT would only be 4 bytes * 3 = 12 bytes plus the original 4 = 16 bytes total. Per row.

  2. PKs are often used by FKs, which is the PK copied into one or more other tables. 72 bytes per row is just for this table. If this PK is used in 2 other tables as a Foreign Key, then that is 72 bytes * 2 = 144 extra bytes. On the other hand the current INT would only be 4 bytes * 2 = 8 bytes.

    Is the FK column indexed? If yes, then that is another 72 bytes instead of 4.

  3. Data pages are loaded into memory (i.e. the Buffer Pool) before their row(s) can be read and used. Larger rows and/or lower FILLFACTOR means more data pages are required to hold those rows. That means more time to read them from disk into memory, and obviously the more memory they will require. This need competes with other queries, plan cache, etc.

If you need to have a GUID to have a value that is know to external systems, just add a UNIQUEIDENTIFIER column and index it; you can then look it up to get the INT value to use for all other JOINs, etc. And, if it has to be stored as a string, then use VARCHAR given the only characters are A, B, C, D, E, F, and - (NVARCHAR is a completely unnecessary waste of space) and be sure to specify a binary Collation (one ending in _BIN2).

But don't change the current structure.


Regarding the update to the question in which it was clarified that the intended datatype to change to is really UNIQUEIDENTIFIER and not NVARCHAR(36): the overall advice does not change from "do not do it". While UNIQUEIDENTIFIER is a better choice than NVARCHAR(36) (smaller and comparisons are binary), it is really just a "less bad" choice more than a "better / good choice".