Ok, I am making a lot of assumptions (INT instead of VARCHAR(50) being one of them) with this answer, so feel free to correct me if needed. The problem with option B is that it introduces a new join to relate Users to Alerts without any real added benefit. If joining on the UserID, it is best to index the UserID, so you can utilize seeks for your joins.
For Option A, UserID will be the clustering key (index key for the clustered index) on the Users table. UserID will be a nonclustered index key on Alerts table. This will cost 16 bytes per Alert.
For Option B, UserID will be the clustering key on the Users table. UserId will probably be the clustering key in UserMap too, to make joining more efficient. UserKey (assuming this is an INT) would then be a nonclustered index key on the Alerts table. This will cost 4 bytes per Alert. And 20 bytes per UserMap.
Looking at the big picture, one relationship, for Option A, costs 16 bytes of storage, and involves 1 join operation. Whereas, one relationship, for Option B, costs 24 bytes of storage, and involves 2 join operations.
Furthermore, there are a possibility of 340,282,366,920,938,000,000,000,000,000,000,000,000 uniqueidentifiers and only 4,294,967,296 INTs. Implementing a uniqueidentifier to INT map for a this type of relationship could cause unexpected results when you start reusing INTs.
The only reason for creating this type map table, is if you plan on creating a Many to Many relationship between Users and Alerts.
Taking all of this into consideration, I would recommend Option A.
I hope this helps,
Matt
You are correct to separate "clustered index" from "primary key":
- A clustered index is the organisation of data on disk is better if
- narrow
- numeric
- increasing (strictly monotonic)
- The primary key identifies a row
Note: GUIDs make poor clustering keys
In this case, with the surrogate column, the table has 2 candidate keys:
- ProductHistoryID
- ProductNo + CreatedDateTime
Assumed convention states that the ProductHistoryID becomes the PK, but you can leave the PK on (ProductNo, CreatedDateTime): it will just be non-clustered. Which leads to indexes:
- clustered index should be on ProductHistoryID
- unique non-clustered index on (ProductNo, CreatedDateTime)
Example
CREATE TABLE Product (
ProductHistoryID int NOT NULL IDENTITY (1,1) NOT NULL,
ProductNo ...
CreatedDateTime ...
then you a choice of
CONSTRAINT PK_Product PRIMARY KEY CLUSTERED (ProductHistoryID)
CONSTRAINT UQ_Product UNIQUE NONCLUSTERED (ProductHistoryID)
or
CONSTRAINT PK_Product PRIMARY KEY NONCLUSTERED (ProductNo, CreatedDateTime)
CONSTRAINT PK_Product UNIQUE CLUSTERED (ProductHistoryID)
Also, the pattern you have is a "type 2 Slowly Changing Dimension"
Best Answer
If you have a one-to-many relationship between Producers and Products (in other words, a product can only belong to one producer), than it would make sense to just put a foreign key reference directly in yourProducts
table:One-To-Many
But if there's a chance that this will be a many-to-many relationship, then your best bet would be to use a Join table.Many-To-Many
If you decide to go with the Join table, you wouldn't need to have an additional key, as the combination of
ProductId/ProducerId
would ultimately be unique. You could use them as a composite key, so you wouldn't need that additionalId
field inProductProducer
.