I've a scheme of two main tables: problem
and tag
and a relation (which is a many to many connector) table: problem_tags
which excerpt of them is like:
Problem table:
----+------------------------------------+--------
id | name | rating
----+------------------------------------+--------
1 | Special Permutation | 1600
2 | Binary String Reconstruction | 1500
3 | Special Elements | 1500
4 | Alice, Bob and Candies | 1300
5 | K-th Not Divisible by n | 1200
6 | Same Parity Summands | 1200
7 | Sum of Round Numbers | 800
8 | Skier | 1400
9 | Square? | 900
Tag table:
id | name
----+---------------------------
1 | constructive algorithms
2 | dfs and similar
3 | math
4 | brute force
5 | implementation
6 | two pointers
7 | binary search
8 | data structures
Problem tags table:
problem_id | tag_id
------------+--------
1 | 1
2 | 1
2 | 2
2 | 3
3 | 4
3 | 5
3 | 6
4 | 5
5 | 3
5 | 7
My question is how can I filter out problems based on multiple tags, i.e. all problems that are tagged math and binary search and brute force; or all problems that are tagged math but not constructive algorithms; or for a more complex one all problems that are only tagged with math and implementation and nothing else?
Currently I've come up with something like this:
- Find all problem's ids that are tagged math (joining tag and problem_tags table)
- Find all problem's ids that are tagged binary search
- Find all problem's ids that are tagged brute force
- Get intersection of all above ids
- Select problems where their ids is in the above intersection
But my solution lacks when it reaches the second example (only tagged with selected tags) and I think it's not the most optimal and SQL-ish way for doing it.
Best Answer
I had just asked that on a different forum. The best solution for a 3rd Normal Form schema design was to select all that is needed in the linking table and compare the Cardinality (count) of the matching results.
The other forum was for Oracle. The answer you seek is the PostgreSQL version of this post: https://community.oracle.com/message/15613817#15613817
Untested (needs to be fixed)
For the 2nd version, add
AND p.id NOT IN
clause.For the 3rd version add a filter that ensures
pt.problem_id
has the exact cardinality of the queried tags.