SELECT B.name
FROM
(
SELECT BB.listing_id id,COUNT(1) taxon_count
FROM
(
SELECT id taxon_id FROM taxons
WHERE name IN ('Ford','Exhaust')
) AA
INNER JOIN listings_taxons BB
USING (taxon_id)
GROUP BY listing_id HAVING COUNT(1) = 2
) A
INNER JOIN listings B USING (id);
Subquery A will bring back all listing_ids that have Ford, Exhaust, or both. Doing the GROUP BY count within Subquery A gives any listing id that has a COUNT(1) of 2 has both Ford and Exhaust taxon ids becasue BB.listing_id would appears twice thus HAVING COUNT(1) = 2. Then Subquery A has an INNER JOIN with listings.
Make sure you have the following indexes
ALTER TABLE listings_taxons ADD INDEX taxon_listing_ndx (taxon_id,listing_id);
ALTER TABLE taxons ADD INDEX name_id_ndx (name,id);
Here is some sample data
drop database if exists nwwatson;
create database nwwatson;
use nwwatson
create table listings
(id int not null auto_increment,
name varchar(25),
primary key (id),
key (name));
create table taxons like listings;
create table listings_taxons
(
listing_id int,
taxon_id int,
primary key (listing_id,taxon_id),
unique key (taxon_id,listing_id)
);
insert into listings (name) values ('SteeringWheel'),('WindShield'),('Muffler'),('AC');
insert into taxons (name) values ('Ford'),('Escort'),('Buick'),('Exhaust'),('Mustard');
insert into listings_taxons values
(1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
(3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
SELECT * FROM listings;
SELECT * FROM taxons;
SELECT * FROM listings_taxons;
SELECT B.name
FROM
(
SELECT BB.listing_id id,COUNT(1) taxon_count
FROM
(
SELECT id taxon_id FROM taxons
WHERE name IN ('Ford','Exhaust')
) AA
INNER JOIN listings_taxons BB
USING (taxon_id)
GROUP BY listing_id HAVING COUNT(1) = 2
) A
INNER JOIN listings B USING (id);
Here is it executed
mysql> drop database if exists nwwatson;
Query OK, 3 rows affected (0.09 sec)
mysql> create database nwwatson;
Query OK, 1 row affected (0.00 sec)
mysql> use nwwatson
Database changed
mysql> create table listings
-> (
-> id int not null auto_increment,
-> name varchar(25),
-> primary key (id),
-> key (name)
-> );
Query OK, 0 rows affected (0.08 sec)
mysql> create table taxons like listings;
Query OK, 0 rows affected (0.05 sec)
mysql> create table listings_taxons
-> (
-> listing_id int,
-> taxon_id int,
-> primary key (listing_id,taxon_id),
-> unique key (taxon_id,listing_id)
-> );
Query OK, 0 rows affected (0.08 sec)
mysql> insert into listings (name) values ('SteeringWheel'),('WindShield'),('Muffler'),('AC');
Query OK, 4 rows affected (0.06 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> insert into taxons (name) values ('Ford'),('Escort'),('Buick'),('Exhaust'),('Mustard');
Query OK, 5 rows affected (0.06 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> insert into listings_taxons values
-> (1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
-> (3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
Query OK, 14 rows affected (0.11 sec)
Records: 14 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM listings;
+----+---------------+
| id | name |
+----+---------------+
| 4 | AC |
| 3 | Muffler |
| 1 | SteeringWheel |
| 2 | WindShield |
+----+---------------+
4 rows in set (0.00 sec)
mysql> SELECT * FROM taxons;
+----+---------+
| id | name |
+----+---------+
| 3 | Buick |
| 2 | Escort |
| 4 | Exhaust |
| 1 | Ford |
| 5 | Mustard |
+----+---------+
5 rows in set (0.00 sec)
mysql> SELECT * FROM listings_taxons;
+------------+----------+
| listing_id | taxon_id |
+------------+----------+
| 1 | 1 |
| 1 | 3 |
| 1 | 5 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 2 | 5 |
| 3 | 1 |
| 3 | 4 |
| 4 | 2 |
| 4 | 3 |
| 4 | 4 |
| 5 | 1 |
| 5 | 5 |
+------------+----------+
14 rows in set (0.00 sec)
mysql> SELECT B.name
-> FROM
-> (
-> SELECT BB.listing_id id,COUNT(1) taxon_count
-> FROM
-> (
-> SELECT id taxon_id FROM taxons
-> WHERE name IN ('Ford','Exhaust')
-> ) AA
-> INNER JOIN listings_taxons BB
-> USING (taxon_id)
-> GROUP BY listing_id HAVING COUNT(1) = 2
-> ) A
-> INNER JOIN listings B USING (id);
+---------+
| name |
+---------+
| Muffler |
+---------+
1 row in set (0.00 sec)
mysql>
Give it a Try !!!
Your query does not do what you think it does. It results in much more rows than you want to because of the cross join. The only reason you haven't noticed it, is that the excessive number of rows produced are not really inserted in the 3rd table due to duplicate keys which trigger either the INSERT IGNORE
or the ON DUPLICATE KEY UPDATE
(And you should really keep one of these two, there is absolutely no reason to have both IGNORE
and ON DUPLICATE KEY UPDATE
. MySQL will either ignore the duplicate keys or try to update the existing row. Does it make sense to ignore someone and talk to him at the same time?)
You probably need:
FROM contacts c
LEFT JOIN dnc_contacts d
ON d.contact = c.contact
AND d.group_id = 21
WHERE c.phonebook_id = 70
AND d.contact IS NULL
Try the above (just SELECT
first, without inserting), to see if it gives you the wanted results.
An index on dnc_contacts (group_id, contact)
would help with efficiency.
Best Answer
Your cartesian product should be the product of # of rows in properly joined tables times the rows in tables that are not properly joined (i.e. don't have effective condition in the
WHERE
clause). For cases wherespecimen.snop_axis = 'M'
morphology and specimen are propely joined but functions isn't, and whenspecimen.snop_axis <> 'M'
functions and specimen are properly joined but morphology isn't. That's most likely why the resulting number of rows doesn't seem to be any direct product of rows in the tables.To solve this I'd first try to join all the tables properly by replacing
with
To me, this seems to give ok result with the sample data, but I wouldn't be too sure of how it behaves with your real data.
I'm not completely sure what you are trying to obtain by having that
IF
condition inWHERE
clause. If the objective is to return morphology data and no functions data whenspecimen.snop_axis='M'
and in other cases to return functions data and no morphology data, I'd suggest having the selection logic applied in the columns that are returned, somehow like this:instead of just having columns from both tables returned always. But as I said, I'm not quite sure I understand what is required.