Mysql – MariaDB (MySQL) slow query when primary key range combined with fulltext index

full-text-searchindex-tuningMySQLquery-performance

I've a table described below, with two columns – integer primary key and title text – currently holding circa 3 million records. As seen in the metadata below, there's a BTREE index on integer primary key column, and FULLTEXT index on title column.

MariaDB [ttsdata]> describe records;
+------------------+---------------------+------+-----+---------------------+-------------------------------+
| Field            | Type                | Null | Key | Default             | Extra                         |
+------------------+---------------------+------+-----+---------------------+-------------------------------+
| id               | int(15) unsigned    | NO   | PRI | NULL                | auto_increment                |
| title            | varchar(2000)       | YES  | MUL |                     |                               |
+------------------+---------------------+------+-----+---------------------+-------------------------------+

MariaDB [ttsada]> show index from records;
+---------+------------+-------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name                | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| records |          0 | PRIMARY                 |            1 | id               | A         |     2798873 |     NULL | NULL   |      | BTREE      |         |               |
| records |          1 | title                   |            1 | title            | NULL      |           1 |     NULL | NULL   | YES  | FULLTEXT   |         |               |
+---------+------------+-------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

I'd like to run the following query:

SELECT * FROM records
WHERE
  id > 2589968 AND
  MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)
LIMIT 200

This query takes more 5 seconds to execute. When I remove the the range part or the fulltext part, in both cases the query executes in circa 100 ms. Below is analysis of individual queries, the last one being the one I want to use.

So my question is: How can I make the query fast?

Below I've posted statistics for each separate condition and for combined conditions. Each statistics contains:

  • EXPLAIN
  • Execution time with SQL_NO_CACHE
  • Number of matched rows

I'm new to MySQL and DBA in general. I don't know how to draw any conclusions from it.

The 2589968 magic number is something that just happens to be needed.

Query 1

SELECT * FROM records
WHERE id > 2589968
LIMIT 200
MariaDB [ttsdata]> explain SELECT * FROM records WHERE id > 2589968 LIMIT 200;
+------+-------------+---------+-------+---------------+---------+---------+------+--------+-----------------------+
| id   | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | Extra                 |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+-----------------------+
|    1 | SIMPLE      | records | range | PRIMARY       | PRIMARY | 4       | NULL | 227183 | Using index condition |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+-----------------------+
1 row in set (0.005 sec)

MariaDB [ttsdata]> SELECT SQL_NO_CACHE * FROM records WHERE id > 2589968 LIMIT 200;
...
200 rows in set (0.108 sec)

MariaDB [ttsdata]> SELECT count(*) FROM records WHERE id > 2589968;
+----------+
| count(*) |
+----------+
|   208905 |
+----------+
1 row in set (0.257 sec)

Query time: 0.108 sec

Query 2

SELECT * FROM records
WHERE MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)
LIMIT 200
MariaDB [ttsdata]> explain SELECT * FROM records WHERE MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+----------+---------------+-------+---------+------+------+-------------+
| id   | select_type | table   | type     | possible_keys | key   | key_len | ref  | rows | Extra       |
+------+-------------+---------+----------+---------------+-------+---------+------+------+-------------+
|    1 | SIMPLE      | records | fulltext | title         | title | 0       |      | 1    | Using where |
+------+-------------+---------+----------+---------------+-------+---------+------+------+-------------+
1 row in set (0.007 sec)

MariaDB [ttsdata]> SELECT SQL_NO_CACHE * FROM records WHERE MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
...
200 rows in set (0.138 sec)

MariaDB [ttsdata]> SELECT count(*) FROM records WHERE MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE);
+----------+
| count(*) |
+----------+
|  1846710 |
+----------+
1 row in set (6.056 sec)

Query time: 0.138 sec

Query 3

SELECT * FROM records
WHERE
  id > 2589968 AND
  MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)
LIMIT 200
MariaDB [ttsdata]> explain SELECT * FROM records WHERE id > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+----------+---------------+-------+---------+------+------+-------------+
| id   | select_type | table   | type     | possible_keys | key   | key_len | ref  | rows | Extra       |
+------+-------------+---------+----------+---------------+-------+---------+------+------+-------------+
|    1 | SIMPLE      | records | fulltext | PRIMARY,title | title | 0       |      | 1    | Using where |
+------+-------------+---------+----------+---------------+-------+---------+------+------+-------------+
1 row in set (0.005 sec)

MariaDB [ttsdata]> SELECT SQL_NO_CACHE * FROM records WHERE id > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
...
200 rows in set (5.627 sec)

MariaDB [ttsdata]> SELECT count(*) FROM records WHERE id > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+----------+
| count(*) |
+----------+
|    91026 |
+----------+
1 row in set (6.817 sec)

Query time: 5.627 sec

Analyze statement on the queries

WHERE id > 2589968
MariaDB [ttsdata]> analyze SELECT * FROM records WHERE id > 2589968 LIMIT 200;
+------+-------------+---------+-------+---------------+---------+---------+------+--------+--------+----------+------------+--------------------------+
| id   | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | r_rows | filtered | r_filtered | Extra                    |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+--------+----------+------------+--------------------------+
|    1 | SIMPLE      | records | range | PRIMARY       | PRIMARY | 4       | NULL | 227210 | 200.00 |   100.00 |     100.00 | Using where; Using index |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+--------+----------+------------+--------------------------+
1 row in set (0.007 sec)
MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)
MariaDB [ttsdata]> analyze SELECT * FROM records WHERE MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+----------+---------------+-----------+---------+------+------+--------+----------+------------+-------------+
| id   | select_type | table   | type     | possible_keys | key       | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
+------+-------------+---------+----------+---------------+-----------+---------+------+------+--------+----------+------------+-------------+
|    1 | SIMPLE      | records | fulltext | title         | title     | 0       |      | 1    | 200.00 |   100.00 |     100.00 | Using where |
+------+-------------+---------+----------+---------------+-----------+---------+------+------+--------+----------+------------+-------------+
1 row in set (0.007 sec)
WHERE id > 2589968 AND
MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) 
MariaDB [ttsdata]> analyze SELECT id FROM records WHERE id > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+----------+---------------+-------+---------+------+------+------------+----------+------------+-------------+
| id   | select_type | table   | type     | possible_keys | key   | key_len | ref  | rows | r_rows     | filtered | r_filtered | Extra       |
+------+-------------+---------+----------+---------------+-------+---------+------+------+------------+----------+------------+-------------+
|    1 | SIMPLE      | records | fulltext | PRIMARY,title | title | 0       |      | 1    | 1755861.00 |     8.12 |       0.01 | Using where |
+------+-------------+---------+----------+---------------+-------+---------+------+------+------------+----------+------------+-------------+
1 row in set (5.768 sec)

EXPLAIN and ANALYZE with index hints

No index hint

Query:

SELECT * FROM records
WHERE
  ID > 2589968
  AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)
LIMIT 200

The same results are reported for USE INDEX (PRIMARY, title) index hint.

Result:

MariaDB [ttsdata]> explain SELECT * FROM records WHERE ID > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+----------+---------------+-----------+---------+------+------+-------------+
| id   | select_type | table   | type     | possible_keys | key       | key_len | ref  | rows | Extra       |
+------+-------------+---------+----------+---------------+-----------+---------+------+------+-------------+
|    1 | SIMPLE      | records | fulltext | PRIMARY,title | title     | 0       |      | 1    | Using where |
+------+-------------+---------+----------+---------------+-----------+---------+------+------+-------------+
1 row in set (0.009 sec)

MariaDB [ttsdata]> analyze SELECT * FROM records WHERE ID > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+----------+---------------+-------+---------+------+------+------------+----------+------------+-------------+
| id   | select_type | table   | type     | possible_keys | key   | key_len | ref  | rows | r_rows     | filtered | r_filtered | Extra       |
+------+-------------+---------+----------+---------------+-------+---------+------+------+------------+----------+------------+-------------+
|    1 | SIMPLE      | records | fulltext | PRIMARY,title | title | 0       |      | 1    | 1755861.00 |     8.12 |       0.01 | Using where |
+------+-------------+---------+----------+---------------+-------+---------+------+------+------------+----------+------------+-------------+
1 row in set (5.763 sec)

USE INDEX(PRIMARY)

Query:

SELECT * FROM records USE INDEX (PRIMARY)
WHERE
  ID > 2589968
  AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)
LIMIT 200

Results:

MariaDB [ttsdata]> explain SELECT * FROM records USE INDEX (PRIMARY) WHERE ID > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+-------+---------------+---------+---------+------+--------+------------------------------------+
| id   | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | Extra                              |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+------------------------------------+
|    1 | SIMPLE      | records | range | PRIMARY       | PRIMARY | 4       | NULL | 227210 | Using index condition; Using where |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+------------------------------------+
1 row in set (0.006 sec)

MariaDB [ttsdata]> analyze SELECT * FROM records USE INDEX (PRIMARY) WHERE ID > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE) LIMIT 200;
+------+-------------+---------+-------+---------------+---------+---------+------+--------+--------+----------+------------+------------------------------------+
| id   | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | r_rows | filtered | r_filtered | Extra                              |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+--------+----------+------------+------------------------------------+
|    1 | SIMPLE      | records | range | PRIMARY       | PRIMARY | 4       | NULL | 227210 | 930.00 |   100.00 |      21.51 | Using index condition; Using where |
+------+-------------+---------+-------+---------------+---------+---------+------+--------+--------+----------+------------+------------------------------------+
1 row in set (0.014 sec)

Best Answer

You have:

  • 208905 rows WHERE id > 2589968
  • 1846710 WHERE MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)
  • 91026 WHERE id > 2589968 AND MATCH (title) AGAINST ('+flower' IN BOOLEAN MODE)

It means ~43% of rows with id > 2589968 contain 'flower' in their title.

For some reason query optimizer decides to use fulltext index to start. Since fulltext index doesn't contain id column server needs to perform a lot of lookups to the table to check id condition. Only ~5% of rows are matched the id condition and I suppose most of them are located "at the end" of fulltext index because you are looking for those ones which were added later (id column is auto_increment). So server performs a lot of unsuccessful lookups and for your conditions it would be faster just to start reading of your clustered index (which is primary key) from id > 2589968 and check each record for fulltext condition (4 of 10 should give the successful result) and finish after 200 matches.

What also you can try to do to make performance better?
Rebuild your table using column FTS_DOC_ID BIGINT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY instead of id. This is built-in MySQL name for column which should contain document id used as reference in fulltext index. I'm not sure if it will work the same way in MariaDB and actually I didn't test it with MySQL but it may help server to avoid a lot of extra lookups.