Mysql – 1 hour query, is there another way to do this

join;MySQLperformancequery-performance

I saw a couple questions similar to this but none could answer my question, really.

I have two tables, and I am trying to find those rows where a match is found.
user has about 350,000 entries and match has about 55,000 entries. There are UNIQUE indices on both user.name as well as on match.string.

My query is:

SELECT user.name
FROM user
INNER JOIN match
on ( 
      (match.type = 'exact' AND user.name = match.string)
   OR (match.type = 'begin' AND user.name LIKE CONCAT(match.string, '%'))
   OR (match.type = 'end' AND user.name LIKE CONCAT('%', match.string))
)

The query is running for an hour now, and no end in sight.
Is there a faster way to do that?

I don't need this for regular production, but I think a query that runs for so long is unacceptable.

Best Answer

No way! Every 'end' is not optimizable because of the leading wildcard. That means that if there are, say, a mere 3K 'end' entries in match, then there will be over a billion (350K * 3K) tests to perform!

The query can be partially optimized by

INDEX(type, string) -- in `match`
INDEX(name) -- in `user`
SELECT ... FROM user JOIN match ON match.type='exact' AND user.name=match.string
UNION ALL
SELECT ... FROM user JOIN match ON match.type='begin' AND user.name LIKE CONCAT...
UNION ALL
SELECT ... FROM user JOIN match ON match.type='end' AND user.name LIKE CONCAT...;

The match INDEX minimizes the effort in finding the relevant rows.
The user INDEX is useful for the first two SELECTs.
The UNION lets the optimizer operate on each SELECT independently.
The ALL avoid a de-dupping phase that is (I assume) unnecessary.

Have you considered a FULLTEXT index? If it works, it will be a million times as fast. (Perhaps literally a million!)

Here's a kludge that can make the third SELECT sargable (gee, I learned a new word!) -- Reverse both the match.string and the user.name, then do the LIKE like 'begin'. No, don't use the REVERSE function in this query, actually build the two tables (or maybe a separate pair of tables) with the reversed strings.

Best Answer

Related Solutions

MySQL Query Optimization – Indexes & Joins 12 million+ records

SQL Server – Should a Subquery Be Used to Help Find the Correct Plan?

Related Question