We have recently migrated from MySQL 5.5 to 5.6 (5.6.44-86.0-log Percona Server
to be precise) and we have a huge problem. Seemingly simple queries run for minutes instead of milliseconds on a table with 300+ million records.
This is the table in question:
CREATE TABLE `articles` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`feed_id` int(11) unsigned NOT NULL,
`date` double(16,6) NOT NULL,
`score` mediumint(8) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `feed_id_date` (`feed_id`,`date`),
KEY `date` (`date`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
Yes, date
is double
because we need microsecond precision. One solution we haven't yet explored is to turn it into bigint
.
Here is a typical query that fails:
mysql> EXPLAIN SELECT a.id FROM articles a WHERE a.feed_id IN (6826,6827) AND a.date < 1564469723.424363 ORDER BY a.date DESC LIMIT 20;
+----+-------------+-------+-------+-------------------+--------------+---------+------+-----------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------+--------------+---------+------+-----------+------------------------------------------+
| 1 | SIMPLE | a | index | feed_id_date,date | feed_id_date | 12 | NULL | 339355570 | Using where; Using index; Using filesort |
+----+-------------+-------+-------+-------------------+--------------+---------+------+-----------+------------------------------------------+
1 row in set (0.00 sec)
Note the number of rows. It's practically a full table scan. This query runs for 3-5 minutes, which is unacceptable for an OLTP load. Now see the following two queries where we select the two feed_id
in question one by one:
mysql> EXPLAIN SELECT a.id FROM articles a WHERE a.feed_id IN (6826) AND a.date < 1564469723.424363 ORDER BY a.date DESC LIMIT 20;
+----+-------------+-------+-------+-------------------+--------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------+--------------+---------+------+------+--------------------------+
| 1 | SIMPLE | a | range | feed_id_date,date | feed_id_date | 12 | NULL | 1 | Using where; Using index |
+----+-------------+-------+-------+-------------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
mysql> EXPLAIN SELECT a.id FROM articles a WHERE a.feed_id IN (6827) AND a.date < 1564469723.424363 ORDER BY a.date DESC LIMIT 20;
+----+-------------+-------+-------+-------------------+--------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------+--------------+---------+------+------+--------------------------+
| 1 | SIMPLE | a | range | feed_id_date,date | feed_id_date | 12 | NULL | 473 | Using where; Using index |
+----+-------------+-------+-------+-------------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
And indeed this is the correct number of rows for each feed_id
:
mysql> SELECT COUNT(*) FROM articles a WHERE a.feed_id=6826;
+----------+
| COUNT(*) |
+----------+
| 1 |
+----------+
1 row in set (0.00 sec)
mysql> SELECT COUNT(*) FROM articles a WHERE a.feed_id=6827;
+----------+
| COUNT(*) |
+----------+
| 474 |
+----------+
1 row in set (0.00 sec)
This is very puzzling to me. We have all kinds of combinations in the IN
clause. Some users have 1000+ feed_id
records that run OK, but in some situations two records in the IN
clause are enough to cause a full table scan.
In the EXPLAIN
it's visible that the problem only arises when the type
is index
, and not range
. On our old MySQL 5.5 instance EXPLAIN
on the same queries and the same dataset always shows range
type and we never had such issue.
Could this be config related? Here's my my.cnf
:
[mysqld]
skip-external-locking
skip-name-resolve
transaction-isolation = READ-COMMITTED
max_connections = 5000
max_user_connections = 4500
back_log = 2048
max_allowed_packet = 128M
sort_buffer_size = 256K
read_buffer_size = 128K
read_rnd_buffer_size = 256K
join_buffer_size = 8M
myisam_sort_buffer_size = 8M
query_cache_limit = 1M
query_cache_size = 0
query_cache_type = 0
key_buffer = 10M
thread_stack = 256K
thread_cache_size = 100
tmp_table_size = 256M
max_heap_table_size = 1G
query_cache_min_res_unit = 1K
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2
innodb_buffer_pool_size = 46G
innodb_buffer_pool_instances = 32
innodb_log_file_size = 1G
innodb_log_buffer_size = 16M
innodb_file_per_table = 1
innodb_io_capacity = 50000
The VM has 64GB RAM and 100k+ iops storage, but this is surely not hardware related.
Best Answer
I fixed it by adding the
feed_id
field to theORDER BY
clause like that:Not sure if this is the correct way, looks a bit hacky to me, especially since this is not needed in
5.5
, so if anyone else has a better solution, let me know.