Ok, I am making a lot of assumptions (INT instead of VARCHAR(50) being one of them) with this answer, so feel free to correct me if needed. The problem with option B is that it introduces a new join to relate Users to Alerts without any real added benefit. If joining on the UserID, it is best to index the UserID, so you can utilize seeks for your joins.
For Option A, UserID will be the clustering key (index key for the clustered index) on the Users table. UserID will be a nonclustered index key on Alerts table. This will cost 16 bytes per Alert.
For Option B, UserID will be the clustering key on the Users table. UserId will probably be the clustering key in UserMap too, to make joining more efficient. UserKey (assuming this is an INT) would then be a nonclustered index key on the Alerts table. This will cost 4 bytes per Alert. And 20 bytes per UserMap.
Looking at the big picture, one relationship, for Option A, costs 16 bytes of storage, and involves 1 join operation. Whereas, one relationship, for Option B, costs 24 bytes of storage, and involves 2 join operations.
Furthermore, there are a possibility of 340,282,366,920,938,000,000,000,000,000,000,000,000 uniqueidentifiers and only 4,294,967,296 INTs. Implementing a uniqueidentifier to INT map for a this type of relationship could cause unexpected results when you start reusing INTs.
The only reason for creating this type map table, is if you plan on creating a Many to Many relationship between Users and Alerts.
Taking all of this into consideration, I would recommend Option A.
I hope this helps,
Matt
The two scripts in RThomas' answer are both useful. You could also use GROUP BY, which gives a similar advantage to RThomas' methods, but keeping a similar form to your original query.
select country
from Users inner join
countries on users.CountryID=countries.CountryID
GROUP BY countries.CountryID, countries.country;
The reason why you group by CountryID is that it's the primary key of your countries table, giving the Query Optimizer some better options.
...except that it's not in your scripts.
Put PKs (with Clustered Indexes) on your tables, and a FK relationship between them. Index CountryID in the Users table, and put a Unique Index on the Country field.
Once you've done all that, using DISTINCT how you have will actually give you the ideal execution plan.
Best Answer
If you do not specify an
order by
for the result set of your first query then your data is not guaranteed to be ordered.It does appear that you want to order by the minimal
id
field in your table based onPRIMARY KEY CLUSTERED ( [id] ASC)
.One way would be by using a
CTE
andMIN(id)
Test data
Result