Sql-server – Indexing – Uniqueidentifier Foreign Key or Intermediary mapping table

indexschemasql serveruniqueidentifier

Looking for some expert views on this guys – I'm not a DBA by trade so would appreciate any advice. Comments on the rest of the schema aren't required as it is purely fabricated to illustrate my question.

Basically I'm working on a new feature for an existing system and am powerless to change the existing schema. So let's say that there currently is a Users table which has a primary key UserID that is of type uniqueidentifier. I want to create a new table called Alerts – related to the Users table in a one-to-many relationship (one User can have many Alerts). Alerts will be regularly queried by this foreign key. I will expect to index this field for performance.

My question is – from what I understand about Indexing strategies, indexing a uniqueidentifier column is (delicately put) not a good idea – so would it be a better idea to have an intermediary table to map these uniqueidentifier keys into a better indexing candidate like an integer?

Users

  • UserID uniqueidentifier {PK}
  • Username varchar(50)

Alerts

  • AlertID int {PK}
  • UserID uniqueidentifier {FK}
  • Message varchar(250)
  • Timestamp datetime

OR

Users

  • UserID uniqueidentifier {PK}
  • Username varchar(50)

UserMap

  • UserKey int identity(1,1) {PK}
  • UserID uniqueidentifier {FK}

Alerts

  • AlertID int identity(1,1) {PK}
  • UserKey int {FK}
  • Message varchar(250)
  • Timestamp datetime

Best Answer

Ok, I am making a lot of assumptions (INT instead of VARCHAR(50) being one of them) with this answer, so feel free to correct me if needed. The problem with option B is that it introduces a new join to relate Users to Alerts without any real added benefit. If joining on the UserID, it is best to index the UserID, so you can utilize seeks for your joins.

For Option A, UserID will be the clustering key (index key for the clustered index) on the Users table. UserID will be a nonclustered index key on Alerts table. This will cost 16 bytes per Alert.

For Option B, UserID will be the clustering key on the Users table. UserId will probably be the clustering key in UserMap too, to make joining more efficient. UserKey (assuming this is an INT) would then be a nonclustered index key on the Alerts table. This will cost 4 bytes per Alert. And 20 bytes per UserMap.

Looking at the big picture, one relationship, for Option A, costs 16 bytes of storage, and involves 1 join operation. Whereas, one relationship, for Option B, costs 24 bytes of storage, and involves 2 join operations.

Furthermore, there are a possibility of 340,282,366,920,938,000,000,000,000,000,000,000,000 uniqueidentifiers and only 4,294,967,296 INTs. Implementing a uniqueidentifier to INT map for a this type of relationship could cause unexpected results when you start reusing INTs.

The only reason for creating this type map table, is if you plan on creating a Many to Many relationship between Users and Alerts.

Taking all of this into consideration, I would recommend Option A.

I hope this helps,

Matt