I'm trying to determine the best way to find all posts that have an exact match with a list of user tags. For example, user has tag IDs: 1,5,9, and 11, while post has tag IDs 1, 5 and 9, thus this is selected as a valid post to display to the user.
The only way I could think of to do this is to count the number of unique tags for each post then count the number of tags a user has in common with the post, compare two values, if equal then it is a match (see query below). Is this truly the best way?
Schema:
Tags
Posts
Posts_tags (tag_id, post_id) unique enforced
Users
Users_tags (tag_id, user_id) unique enforced
Current query (broken down so more legible):
Sub-Query #1:
SELECT COUNT(*) as numrows, pt.post_id AS id
FROM users_tags AS ut
INNER JOIN posts_tags AS pt
ON ut.tag_id = pt.tag_id
WHERE ut.user_id = 1
GROUP BY pt.post_id
Sub-Query #2:
SELECT COUNT(*) AS numrows, post_id AS id
FROM posts_tags
GROUP BY post_id
Master Query:
SELECT t1.id, t2.id
FROM (Sub-Query #1) t1
INNER JOIN (Sub-Query #2) t2
ON (t1.numrows = t2.numrows)
AND (t1.id = t2.id)
Also, would a graph database like neo4j be better at this type of query?
Best Answer
If you are using MySQL 5.6 or later, then the query will be reasonably efficient. If using an older version, it is "order N*N", which is very slow for large subqueries.