I saw a couple questions similar to this but none could answer my question, really.
I have two tables, and I am trying to find those rows where a match is found.
user
has about 350,000 entries and match
has about 55,000 entries. There are UNIQUE
indices on both user.name
as well as on match.string
.
My query is:
SELECT user.name
FROM user
INNER JOIN match
on (
(match.type = 'exact' AND user.name = match.string)
OR (match.type = 'begin' AND user.name LIKE CONCAT(match.string, '%'))
OR (match.type = 'end' AND user.name LIKE CONCAT('%', match.string))
)
The query is running for an hour now, and no end in sight.
Is there a faster way to do that?
I don't need this for regular production, but I think a query that runs for so long is unacceptable.
Best Answer
No way! Every 'end' is not optimizable because of the leading wildcard. That means that if there are, say, a mere 3K 'end' entries in match, then there will be over a billion (350K * 3K) tests to perform!
The query can be partially optimized by
match
INDEX minimizes the effort in finding the relevant rows.user
INDEX is useful for the first two SELECTs.Have you considered a FULLTEXT index? If it works, it will be a million times as fast. (Perhaps literally a million!)
Here's a kludge that can make the third SELECT sargable (gee, I learned a new word!) -- Reverse both the match.string and the user.name, then do the LIKE like 'begin'. No, don't use the REVERSE function in this query, actually build the two tables (or maybe a separate pair of tables) with the reversed strings.