This is the relational division problem and there is a question about it at SO, with a lot of ways to write this query, plus performance analysis for PostgreSQL: How to filter SQL results in a has-many-through relation
Shamelessly copying code from there, and removing or changing the answers that rely on features MySQL lacked at the time (CTEs, EXCEPT, INTERSECT, etc.), here are a few ways to do this.
Assumptions:
- the table is called factors
- there is a UNIQUE constraint on (wordid, docid)
- there is a documents table and a words table
Easy to write, medium efficiency:
-- Query 1 -- by Martin
SELECT d.docid, d.docname
FROM documents d
JOIN factors f USING (docid)
WHERE f.wordid IN (2, 4, 5)
GROUP BY d.docid
HAVING COUNT(*) = 3 ; -- number of words
Easy to write, medium efficiency:
-- Query 2 -- by Erwin
SELECT d.docid, d.docname
FROM documents d
JOIN (
SELECT docid
FROM factors
WHERE wordid IN (2, 4, 5)
GROUP BY docid
HAVING COUNT(*) = 3
) f USING (docid) ;
More complex to write, very good efficiency in Postgres - probably lousy in MySQL:
-- Query 4 -- by Derek
SELECT d.docid, d.docname
FROM documents d
WHERE d.docid IN (SELECT docid FROM factors WHERE wordid = 2)
AND d.docid IN (SELECT docid FROM factors WHERE wordid = 4)
AND d.docid IN (SELECT docid FROM factors WHERE wordid = 5);
More complex to write, very good efficiency in Postgres - and probably the same in MySQL:
-- Query 5 -- by Erwin
SELECT d.docid, d.docname
FROM documents d
WHERE EXISTS (SELECT * FROM factors
WHERE docid = d.docid AND wordid = 2)
AND EXISTS (SELECT * FROM factors
WHERE docid = d.docid AND wordid = 4)
AND EXISTS (SELECT * FROM factors
WHERE docid = d.docid AND wordid = 5) ;
More complex to write, very good efficiency in Postgres - and probably the same in MySQL:
-- Query 6 -- by Sean
SELECT d.docid, d.docname
FROM documents d
JOIN factors x ON d.docid = x.docid
JOIN factors y ON d.docid = y.docid
JOIN factors z ON d.docid = z.docid
WHERE x.wordid = 2
AND y.wordid = 4
AND z.wordid = 5 ;
Easy to write and extend to an arbitrary set of words, but not as efficient as the JOIN and EXISTS solutions:
-- Query 7 -- by ypercube
SELECT d.docid, d.docname
FROM documents d
WHERE NOT EXISTS (
SELECT *
FROM words AS w
WHERE w.wordid IN (2, 4, 5)
AND NOT EXISTS (
SELECT *
FROM factors AS f
WHERE f.docid = d.docid
AND f.wordid = w.wordid
)
);
Easy to write, not good efficiency:
-- Query 8 -- by ypercube
SELECT d.docid, d.docname
FROM documents d
WHERE NOT EXISTS (
SELECT *
FROM (
SELECT 2 AS wordid UNION ALL
SELECT 4 UNION ALL
SELECT 5
) AS w
WHERE NOT EXISTS (
SELECT *
FROM factors AS f
WHERE f.docid = d.docid
AND f.wordid = w.wordid
)
);
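For a quick sanity check before benchmarking, here is a minimal sketch that runs Queries 2, 5 and 7 against a toy data set and compares their results. It uses Python's built-in sqlite3 module rather than MySQL, purely so that it is self-contained; the sample rows and document names are invented for illustration, and it says nothing about MySQL performance.

```python
import sqlite3

# Toy data (invented): documents 1 and 3 contain words 2, 4 and 5;
# document 2 is missing word 5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (docid INTEGER PRIMARY KEY, docname TEXT);
    CREATE TABLE words     (wordid INTEGER PRIMARY KEY);
    CREATE TABLE factors   (wordid INTEGER, docid INTEGER,
                            UNIQUE (wordid, docid));
    INSERT INTO documents VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO words     VALUES (1), (2), (4), (5);
    INSERT INTO factors   VALUES (2, 1), (4, 1), (5, 1),
                                 (2, 2), (4, 2),
                                 (1, 3), (2, 3), (4, 3), (5, 3);
""")

QUERIES = {
    # Query 2: GROUP BY / HAVING in a derived table
    "query_2": """
        SELECT d.docid, d.docname
        FROM documents d
        JOIN (SELECT docid FROM factors
              WHERE wordid IN (2, 4, 5)
              GROUP BY docid
              HAVING COUNT(*) = 3) f USING (docid)
    """,
    # Query 5: one EXISTS per word
    "query_5": """
        SELECT d.docid, d.docname
        FROM documents d
        WHERE EXISTS (SELECT * FROM factors WHERE docid = d.docid AND wordid = 2)
          AND EXISTS (SELECT * FROM factors WHERE docid = d.docid AND wordid = 4)
          AND EXISTS (SELECT * FROM factors WHERE docid = d.docid AND wordid = 5)
    """,
    # Query 7: double NOT EXISTS
    "query_7": """
        SELECT d.docid, d.docname
        FROM documents d
        WHERE NOT EXISTS (
            SELECT * FROM words AS w
            WHERE w.wordid IN (2, 4, 5)
              AND NOT EXISTS (SELECT * FROM factors AS f
                              WHERE f.docid = d.docid
                                AND f.wordid = w.wordid))
    """,
}

def matching_docids(sql):
    """Return the sorted docids a query selects."""
    return sorted(row[0] for row in conn.execute(sql))

results = {name: matching_docids(sql) for name, sql in QUERIES.items()}
for name, ids in results.items():
    print(name, ids)
```

All three formulations should pick the same documents here; any disagreement on your own data usually points to duplicate (wordid, docid) rows, which the UNIQUE constraint is meant to rule out.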
Enjoy testing them :)
You could use a user-defined variable in MySQL to achieve your goal:
create table t ( id int not null auto_increment,
col_a int not null,
col_b int not null,
primary key(id) );
insert into t(col_a, col_b) values (1,20), (1,70), (1,110), (1,30);
set @sum := 0;
select *, col_a * col_b, @sum as s
from t
where (@sum := @sum + (col_a * col_b)) <= 200
order by id;
+----+-------+-------+---------------+------+
| id | col_a | col_b | col_a * col_b | s |
+----+-------+-------+---------------+------+
| 1 | 1 | 20 | 20 | 20 |
| 2 | 1 | 70 | 70 | 90 |
| 3 | 1 | 110 | 110 | 200 |
+----+-------+-------+---------------+------+
3 rows in set (0.00 sec)
You must use ORDER BY to make sure of the row order, as suggested by ypercube. For example, you can sum the column in reverse order:
set @sum := 0;
select *, col_a * col_b, @sum as s
from t where (@sum := @sum + (col_a * col_b)) <= 200
order by id desc;
+----+-------+-------+---------------+------+
| id | col_a | col_b | col_a * col_b | s |
+----+-------+-------+---------------+------+
| 4 | 1 | 30 | 30 | 30 |
| 3 | 1 | 110 | 110 | 140 |
+----+-------+-------+---------------+------+
2 rows in set (0.00 sec)
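To see why the row order changes the result, here is a small Python sketch of the same accumulation. It is only a model of the logic, under the assumption the answer itself relies on, namely that the rows are visited in ORDER BY order; the function name rows_within_limit is made up for this illustration.

```python
# Same rows as the INSERT above: (id, col_a, col_b).
ROWS = [(1, 1, 20), (2, 1, 70), (3, 1, 110), (4, 1, 30)]

def rows_within_limit(rows, limit, descending=False):
    """Model of: WHERE (@sum := @sum + col_a * col_b) <= limit."""
    total = 0
    kept = []
    for row_id, col_a, col_b in sorted(rows, reverse=descending):
        total += col_a * col_b   # the assignment runs for every row visited
        if total <= limit:       # the comparison decides whether the row is kept
            kept.append((row_id, total))
    return kept

ascending = rows_within_limit(ROWS, 200)
descending = rows_within_limit(ROWS, 200, descending=True)
print(ascending)    # ids 1, 2, 3 with running sums 20, 90, 200
print(descending)   # ids 4, 3 with running sums 30, 140
```

Note that the running total keeps growing even after a row is rejected, so once the limit is exceeded no later row can pass the filter.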
This requirement is known as relational division. One popular approach is
If your table has a unique constraint on (sentence_id, meaning_id), you can drop the DISTINCT in the query above.
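As an illustration of where DISTINCT matters, here is a minimal sketch of one common form of relational division: GROUP BY with HAVING COUNT(DISTINCT ...). Only the column names sentence_id and meaning_id come from the text; the table name sentence_meaning, the sample rows and the wanted meaning ids are hypothetical. It runs through Python's sqlite3 module so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with no unique constraint, so duplicates are possible.
    CREATE TABLE sentence_meaning (sentence_id INTEGER, meaning_id INTEGER);
    INSERT INTO sentence_meaning VALUES
        (1, 10), (1, 20),            -- has both wanted meanings
        (2, 10), (2, 10),            -- the same meaning twice
        (3, 20), (3, 10), (3, 30);   -- both wanted meanings plus an extra one
""")

wanted = [10, 20]  # hypothetical meaning ids to require
placeholders = ", ".join("?" for _ in wanted)

sql = f"""
    SELECT sentence_id
    FROM sentence_meaning
    WHERE meaning_id IN ({placeholders})
    GROUP BY sentence_id
    HAVING COUNT(DISTINCT meaning_id) = ?
    ORDER BY sentence_id
"""
sentence_ids = [row[0] for row in conn.execute(sql, wanted + [len(wanted)])]
print(sentence_ids)
```

Sentence 2 holds the same meaning twice, so a plain COUNT(*) would wrongly accept it; with a unique constraint on (sentence_id, meaning_id) such duplicates cannot occur and the DISTINCT becomes unnecessary.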