Mysql – What’s the best way to go about searching for short names

full-text-searchMySQL

I have a table that has CHAR in Korean for Soccer player names. All of these are under 12 characters, and while LIKE "%name%" works, Slight mistypes or variations don't work, and I've heard that LIKE isn't the fastest solution compared to something like Fulltext Indexes. I've also heard that the NGRAM parser works best with Korean/Japanese/Chinese characters, but it requires that I put in a title/body column; I only want to search for names, nothing else.

I currently have ~30k rows, and it won't grow more than ~45k. What's the best way to go about this?

Here's what I tried (after making an word index table in ngram):

...MATCH (name) AGAINST ('foo' IN NATURAL LANGUAGE MODE);

but it's not as precise as using LIKE, since it returns any word that's remotely connected to the query word.

Best Answer

A trick that sometimes is useful...

  • MATCH is usually the fastest way to search
  • MATCH has some drawbacks
  • LIKE and RLIKE are very slow
  • When using both MATCH and LIKE, MATCH is done first, thereby providing very few rows to recheck with LIKE:

    WHERE MATCH(...) AGAINST(...)
      AND ... LIKE '%...%'
    

(I am not familiar with how NGRAM works, so I can't provide specifics there.)

From the 5.7.23 and 8.0.12 Changelogs: "The ngram full-text search parser permitted comma and period characters to be tokenized as words, which caused an inconsistency between boolean and natural language mode search results. Comma and period characters are no longer tokenized." (This implies that 5.7.23 has a fix for it.)