Sql-server – alter primary id field to uniqueidentifier GUID as default in sql server

alter-tableprimary-keysql serversql-server-2008-r2uniqueidentifier

I have tables with id(primary key – int) with identity specification (1,1) .
I want to change this column to uniqueidentifier with default ( newid() ) .

What I've tried is

ALTER TABLE myTable ALTER COLUMN id uniqueidentifier default NEWID();

but I got this message

Incorrect syntax near the keyword 'default'.

Additionally ,every tables of my database has id primary key column and I want to change them to uniqueidentifier with default newID() , by looping or something .

Best Answer

every tables of my database has id primary key column and I want to change them to nvarchar(36) with default newID()

No. You do not want to make that change as no good can come of it.

You currently have a PK (Clustered, I assume) that is:

Compact: 4 bytes per row
Efficient: comparisons are simple 4-byte-value-to-4-byte-value (i.e. a binary comparison)
Minimally Fragmenting: new values are sequential and added to the end of the table.

You are wanting to change this to a Clustered PK that is:

Wide: 72 bytes per row (GUID is 16 bytes, which is 32 characters in string form for the hex values plus 4 dashes for 36 characters total, which is 72 bytes in UTF-16 / NVARCHAR).
Inefficient: If using UNIQUEIDENTIFIER it is not so bad as that is still a binary comparisons, just like with INT, but is 16 bytes unlike INT which is 4. Storing this as a string is now a 36 character comparison (36 bytes in VARCHAR and 72 bytes in NVARCHAR) which is slower than the 16 byte UNIQUEIDENTIFIER. Finally, most people storing GUIDs as strings forget to use a binary Collation (e.g. Latin1_General_100_BIN2) to at least make the comparisons byte-by-byte since linguistic rules aren't necessary. Using a case-insensitive Collation, or even a case-sensitive one, is definitely slower as it will apply locale-based linguistic rules.
Highly Fragmenting: new values are all over the place, leading to greatly increased page splits. And if you decrease FILLFACTOR to reduce the number of page splits, you are also reducing the performance of the index as it is spread out over a larger amount of pages.

Please keep in mind the down-stream negative effects of this change due to:

Clustered index keys are copied into non-clustered indexes. Assuming, again, that this PK is Clustered, then each non-clustered index on this table will have that 72 byte value copied into it. Three non-clustered indexes on this table is 72 bytes * 3 = 216 bytes plus the original 72 of the Clustered index = 288 bytes total. On the other hand, the current INT would only be 4 bytes * 3 = 12 bytes plus the original 4 = 16 bytes total. Per row.
PKs are often used by FKs, which is the PK copied into one or more other tables. 72 bytes per row is just for this table. If this PK is used in 2 other tables as a Foreign Key, then that is 72 bytes * 2 = 144 extra bytes. On the other hand the current INT would only be 4 bytes * 2 = 8 bytes.

Is the FK column indexed? If yes, then that is another 72 bytes instead of 4.
Data pages are loaded into memory (i.e. the Buffer Pool) before their row(s) can be read and used. Larger rows and/or lower FILLFACTOR means more data pages are required to hold those rows. That means more time to read them from disk into memory, and obviously the more memory they will require. This need competes with other queries, plan cache, etc.

If you need to have a GUID to have a value that is know to external systems, just add a UNIQUEIDENTIFIER column and index it; you can then look it up to get the INT value to use for all other JOINs, etc. And, if it has to be stored as a string, then use VARCHAR given the only characters are A, B, C, D, E, F, and - (NVARCHAR is a completely unnecessary waste of space) and be sure to specify a binary Collation (one ending in _BIN2).

But don't change the current structure.

Regarding the update to the question in which it was clarified that the intended datatype to change to is really UNIQUEIDENTIFIER and not NVARCHAR(36): the overall advice does not change from "do not do it". While UNIQUEIDENTIFIER is a better choice than NVARCHAR(36) (smaller and comparisons are binary), it is really just a "less bad" choice more than a "better / good choice".

Related Solutions

Sql-server – Indexing – Uniqueidentifier Foreign Key or Intermediary mapping table

Ok, I am making a lot of assumptions (INT instead of VARCHAR(50) being one of them) with this answer, so feel free to correct me if needed. The problem with option B is that it introduces a new join to relate Users to Alerts without any real added benefit. If joining on the UserID, it is best to index the UserID, so you can utilize seeks for your joins.

For Option A, UserID will be the clustering key (index key for the clustered index) on the Users table. UserID will be a nonclustered index key on Alerts table. This will cost 16 bytes per Alert.

For Option B, UserID will be the clustering key on the Users table. UserId will probably be the clustering key in UserMap too, to make joining more efficient. UserKey (assuming this is an INT) would then be a nonclustered index key on the Alerts table. This will cost 4 bytes per Alert. And 20 bytes per UserMap.

Looking at the big picture, one relationship, for Option A, costs 16 bytes of storage, and involves 1 join operation. Whereas, one relationship, for Option B, costs 24 bytes of storage, and involves 2 join operations.

Furthermore, there are a possibility of 340,282,366,920,938,000,000,000,000,000,000,000,000 uniqueidentifiers and only 4,294,967,296 INTs. Implementing a uniqueidentifier to INT map for a this type of relationship could cause unexpected results when you start reusing INTs.

The only reason for creating this type map table, is if you plan on creating a Many to Many relationship between Users and Alerts.

Taking all of this into consideration, I would recommend Option A.

I hope this helps,

Matt

SQL Server – How to Alter Time of DateTime Field

First thing is you will need to stop performing shorthand math operations like + against date/time values. This breaks when you switch to the newer types, which you should do.

For this particular problem, you can simply use a CASE expression to either (a) return the datetime as is, or (b) convert to date and then add 11 hours.

CASE WHEN DATEPART(HOUR, col) < 11 
  THEN col
  ELSE DATEADD(HOUR, 11, CONVERT(datetime, CONVERT(date, col)))
END

Best Answer

Related Solutions

Sql-server – Indexing – Uniqueidentifier Foreign Key or Intermediary mapping table

SQL Server – How to Alter Time of DateTime Field

Related Question