MySQL – Boolean Full-text Search

full-text-searchMySQL

I'm using MySQL's boolean full-text search for my search engine: https://dev.mysql.com/doc/refman/5.6/en/fulltext-boolean.html.

Assume I have a table with the following columns:

PRODUCTS
--------
id (PK)
color_id (FK to COLORS table)
name
description

When given a search term, I want to search the product's table's name and description columns, as well as the product's color name. The color's name is in a separate table.

Should I create a new column in the product's table and put all the text I want indexed in there? For example, I add a products.full_text_index column, and dump a concatenation of name, description, and the color's name in that column? Then do full-text search against that column?

UPDATE: Or, instead of a column in an existing table, maybe create a new table to contain all the text that should be used for the full-text search. That table can just have a FK to the products table.

Best Answer

Yes, that is a good way to use FULLTEXT, especially if you need to 'cleanse' the data in any way, or gather text from related tables.

There is probably not much difference between having the new column in the same table or in a separate table.

The redundancy is wasteful. However, there is no way to have any kind (FULLTEXT or other) of index that spans multiple tables. Might it be better to have all the columns in the same table, and do FULLTEXT(this, that, the_other)?

And the redundancy is potentially error-prone. This could be tempered by having a Stored Procedure that users must call when inserting or manipulating any of the text involved in the issue. This SP would always make the copy for you.

When using a FT index with other things, such as

WHERE price > 50
  AND MATCH(stuff) AGAINST(...)

the FT test will be done first. Hopefully, that will select only a small number of rows. Then the other tests will be applied to further filter the results. When the situation is reversed (price is more selective than MATCH), well, too bad.

If you always have the MATCH, there is no need for indexing price, the index won't be used.