OK, I am making a lot of assumptions with this answer (INT instead of VARCHAR(50) being one of them), so feel free to correct me if needed. The problem with Option B is that it introduces a new join to relate Users to Alerts without any real added benefit. If you are joining on UserID, it is best to index UserID so the joins can use index seeks.
For Option A, UserID will be the clustering key (the index key for the clustered index) on the Users table, and a nonclustered index key on the Alerts table. Since a uniqueidentifier is 16 bytes, this costs 16 bytes per Alert.
For Option B, UserID will be the clustering key on the Users table. UserID will probably be the clustering key in UserMap too, to make joining more efficient. UserKey (assuming this is an INT) would then be a nonclustered index key on the Alerts table. This costs 4 bytes per Alert, plus 20 bytes per UserMap row (16 for UserID and 4 for UserKey).
Looking at the big picture, one relationship under Option A costs 16 bytes of storage and involves 1 join operation, whereas one relationship under Option B costs 24 bytes of storage and involves 2 join operations.
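To make the comparison concrete, here is a rough T-SQL sketch of the two layouts as I understand them. Only Users, Alerts, UserMap, UserID and UserKey come from the question; AlertID, the AlertsB name (used so both options can live in one script) and all constraint and index names are my own assumptions.

```sql
-- Option A: Alerts references Users directly by the 16-byte uniqueidentifier.
CREATE TABLE Users (
    UserID UNIQUEIDENTIFIER NOT NULL,
    CONSTRAINT PK_Users PRIMARY KEY CLUSTERED (UserID)
);

CREATE TABLE Alerts (
    AlertID INT IDENTITY(1,1) NOT NULL,   -- assumed surrogate key
    UserID  UNIQUEIDENTIFIER NOT NULL,    -- 16 bytes per alert
    CONSTRAINT PK_Alerts PRIMARY KEY CLUSTERED (AlertID),
    CONSTRAINT FK_Alerts_Users FOREIGN KEY (UserID) REFERENCES Users (UserID)
);
CREATE NONCLUSTERED INDEX IX_Alerts_UserID ON Alerts (UserID);

-- One join relates a user to their alerts.
SELECT a.AlertID
FROM Users AS u
INNER JOIN Alerts AS a ON a.UserID = u.UserID;

-- Option B: a map table translates the uniqueidentifier to a 4-byte INT.
CREATE TABLE UserMap (
    UserID  UNIQUEIDENTIFIER NOT NULL,    -- 16 bytes per map row
    UserKey INT NOT NULL,                 -- 4 bytes per map row
    CONSTRAINT PK_UserMap PRIMARY KEY CLUSTERED (UserID),
    CONSTRAINT UQ_UserMap_UserKey UNIQUE (UserKey),
    CONSTRAINT FK_UserMap_Users FOREIGN KEY (UserID) REFERENCES Users (UserID)
);

CREATE TABLE AlertsB (                    -- hypothetical name for the Option B Alerts table
    AlertID INT IDENTITY(1,1) NOT NULL,
    UserKey INT NOT NULL,                 -- 4 bytes per alert
    CONSTRAINT PK_AlertsB PRIMARY KEY CLUSTERED (AlertID),
    CONSTRAINT FK_AlertsB_UserMap FOREIGN KEY (UserKey) REFERENCES UserMap (UserKey)
);
CREATE NONCLUSTERED INDEX IX_AlertsB_UserKey ON AlertsB (UserKey);

-- Two joins are needed to get from a user to their alerts.
SELECT a.AlertID
FROM Users AS u
INNER JOIN UserMap AS m ON m.UserID = u.UserID
INNER JOIN AlertsB AS a ON a.UserKey = m.UserKey;
```

The point of the sketch is only the shape of the two queries at the end of each option: the extra hop through UserMap is the second join counted above.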
Furthermore, there are roughly 340,282,366,920,938,000,000,000,000,000,000,000,000 (2^128) possible uniqueidentifiers and only 4,294,967,296 (2^32) INTs. Implementing a uniqueidentifier-to-INT map for this type of relationship could cause unexpected results when you start reusing INTs.
The only reason for creating this type of map table is if you plan on creating a many-to-many relationship between Users and Alerts.
Taking all of this into consideration, I would recommend Option A.
I hope this helps,
Matt
SELECT B.name
FROM
(
SELECT BB.listing_id id,COUNT(1) taxon_count
FROM
(
SELECT id taxon_id FROM taxons
WHERE name IN ('Ford','Exhaust')
) AA
INNER JOIN listings_taxons BB
USING (taxon_id)
GROUP BY listing_id HAVING COUNT(1) = 2
) A
INNER JOIN listings B USING (id);
Inside subquery A, joining AA to listings_taxons brings back every listing_id that has Ford, Exhaust, or both. Grouping by listing_id and keeping only the groups with COUNT(1) = 2 leaves the listings that have both the Ford and the Exhaust taxon ids, because such a listing_id appears twice in the joined rows. Subquery A is then joined to listings with an INNER JOIN to retrieve the names.
Make sure you have the following indexes:
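If the nested derived tables are hard to follow, the same idea can be written as a flat join. This is only a sketch against the same three tables; the table aliases and the use of COUNT(DISTINCT t.name) are my own additions, the latter as a guard in case a taxon name is ever stored more than once.

```sql
-- Keep only listings tagged with both 'Ford' and 'Exhaust'.
SELECT l.name
FROM listings AS l
INNER JOIN listings_taxons AS lt ON lt.listing_id = l.id
INNER JOIN taxons AS t ON t.id = lt.taxon_id
WHERE t.name IN ('Ford','Exhaust')
GROUP BY l.id, l.name
-- 2 is the number of names in the IN list; change both together.
HAVING COUNT(DISTINCT t.name) = 2;
```

To match more taxons, extend the IN list and raise the number in the HAVING clause to the new list length.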
ALTER TABLE listings_taxons ADD INDEX taxon_listing_ndx (taxon_id,listing_id);
ALTER TABLE taxons ADD INDEX name_id_ndx (name,id);
Here is some sample data:
drop database if exists nwwatson;
create database nwwatson;
use nwwatson
create table listings
(id int not null auto_increment,
name varchar(25),
primary key (id),
key (name));
create table taxons like listings;
create table listings_taxons
(
listing_id int,
taxon_id int,
primary key (listing_id,taxon_id),
unique key (taxon_id,listing_id)
);
insert into listings (name) values ('SteeringWheel'),('WindShield'),('Muffler'),('AC');
insert into taxons (name) values ('Ford'),('Escort'),('Buick'),('Exhaust'),('Mustard');
insert into listings_taxons values
(1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
(3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
SELECT * FROM listings;
SELECT * FROM taxons;
SELECT * FROM listings_taxons;
SELECT B.name
FROM
(
SELECT BB.listing_id id,COUNT(1) taxon_count
FROM
(
SELECT id taxon_id FROM taxons
WHERE name IN ('Ford','Exhaust')
) AA
INNER JOIN listings_taxons BB
USING (taxon_id)
GROUP BY listing_id HAVING COUNT(1) = 2
) A
INNER JOIN listings B USING (id);
Here it is executed:
mysql> drop database if exists nwwatson;
Query OK, 3 rows affected (0.09 sec)
mysql> create database nwwatson;
Query OK, 1 row affected (0.00 sec)
mysql> use nwwatson
Database changed
mysql> create table listings
-> (
-> id int not null auto_increment,
-> name varchar(25),
-> primary key (id),
-> key (name)
-> );
Query OK, 0 rows affected (0.08 sec)
mysql> create table taxons like listings;
Query OK, 0 rows affected (0.05 sec)
mysql> create table listings_taxons
-> (
-> listing_id int,
-> taxon_id int,
-> primary key (listing_id,taxon_id),
-> unique key (taxon_id,listing_id)
-> );
Query OK, 0 rows affected (0.08 sec)
mysql> insert into listings (name) values ('SteeringWheel'),('WindShield'),('Muffler'),('AC');
Query OK, 4 rows affected (0.06 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> insert into taxons (name) values ('Ford'),('Escort'),('Buick'),('Exhaust'),('Mustard');
Query OK, 5 rows affected (0.06 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> insert into listings_taxons values
-> (1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
-> (3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
Query OK, 14 rows affected (0.11 sec)
Records: 14 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM listings;
+----+---------------+
| id | name          |
+----+---------------+
|  4 | AC            |
|  3 | Muffler       |
|  1 | SteeringWheel |
|  2 | WindShield    |
+----+---------------+
4 rows in set (0.00 sec)
mysql> SELECT * FROM taxons;
+----+---------+
| id | name    |
+----+---------+
|  3 | Buick   |
|  2 | Escort  |
|  4 | Exhaust |
|  1 | Ford    |
|  5 | Mustard |
+----+---------+
5 rows in set (0.00 sec)
mysql> SELECT * FROM listings_taxons;
+------------+----------+
| listing_id | taxon_id |
+------------+----------+
|          1 |        1 |
|          1 |        3 |
|          1 |        5 |
|          2 |        1 |
|          2 |        2 |
|          2 |        3 |
|          2 |        5 |
|          3 |        1 |
|          3 |        4 |
|          4 |        2 |
|          4 |        3 |
|          4 |        4 |
|          5 |        1 |
|          5 |        5 |
+------------+----------+
14 rows in set (0.00 sec)
mysql> SELECT B.name
-> FROM
-> (
-> SELECT BB.listing_id id,COUNT(1) taxon_count
-> FROM
-> (
-> SELECT id taxon_id FROM taxons
-> WHERE name IN ('Ford','Exhaust')
-> ) AA
-> INNER JOIN listings_taxons BB
-> USING (taxon_id)
-> GROUP BY listing_id HAVING COUNT(1) = 2
-> ) A
-> INNER JOIN listings B USING (id);
+---------+
| name    |
+---------+
| Muffler |
+---------+
1 row in set (0.00 sec)
mysql>
Give it a try!
You can try something like:
Explanation: first, we need to know the number of DVDs available.
Then we need to get the number of distinct DVDs purchased by each customer and compare it with the total we got previously.
Finally, we just need to join with the customers table to retrieve the names.
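The three steps above could be sketched roughly as follows. The schema is not shown here, so every table and column name below (dvds, purchases, customers, dvd_id, customer_id, name) is a hypothetical stand-in; adjust them to the real tables.

```sql
-- Hypothetical minimal schema, only so the query has something to run against.
CREATE TABLE dvds      (dvd_id INT PRIMARY KEY);
CREATE TABLE customers (customer_id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE purchases (customer_id INT, dvd_id INT);

SELECT c.name
FROM customers AS c
INNER JOIN (
    -- Step 2: distinct DVDs purchased per customer ...
    SELECT p.customer_id
    FROM purchases AS p
    GROUP BY p.customer_id
    -- ... compared with step 1: the total number of DVDs available.
    HAVING COUNT(DISTINCT p.dvd_id) = (SELECT COUNT(*) FROM dvds)
) AS bought_all ON bought_all.customer_id = c.customer_id;  -- Step 3: join for the names
```

COUNT(DISTINCT ...) matters here because a customer may buy the same DVD more than once; a plain COUNT(*) would overcount such purchases.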