Mysql – what could make examined_rows change radically

MySQLmysql-5.6

I have a mysql-based application which sometimes takes forever to execute a select statement. When this happens, the statement appears naturally in my slow query log:

# Query_time: 51.826420  Lock_time: 0.000143 Rows_sent: 3  Rows_examined: 15574557

Rows examined above tells me that a full table scan occured. However, normally, the same query examines at least one order of magnitude less rows. Namely, performance_schema.events_statements_history tells me exactly 982937. Naturally, the execution time for this query is one order of magnitude smaller in this case.

I checked that explain on this query appears to give the same result in both cases. I also checked that if I add a USE INDEX statement when this problem occurs, the execution time goes down back to normal but, normally, I do not need to add the USE INDEX.

One of the queries that exhibit this behavior is shown below:

SELECT     
     COUNT(DISTINCT(t.id)),
     t.tag_3 AS group_0 
FROM
     tasks AS t
WHERE
         t.message_time >= FROM_UNIXTIME(1486508400)
     AND t.message_time  < FROM_UNIXTIME(1487113200)
     AND (   (t.type = 12 AND t.site_id = 172)
          OR (t.type = 1  AND t.site_id = 172)
          OR (t.type = 8  AND t.site_id = 173) )
     AND t.tag_1 IN (74,75,76,77,78,79,80,81,99,263)
     AND t.tag_3 IN (302,303,305)
     AND t.tag_4 IN (315,316,317,318,319,320,321,322,323,351,352,357)
     AND t.site_id IN (172,173)
GROUP BY
     group_0;

The explain output:

+----+-------------+-------+-------+-----------------------------------------------------------------------+----------+---------+------+-------+----------------------------------------------------+
| id | select_type | table | type  | possible_keys                                                         | key      | key_len | ref  | rows  | Extra                                              |
+----+-------------+-------+-------+-----------------------------------------------------------------------+----------+---------+------+-------+----------------------------------------------------+
|  1 | SIMPLE      | t     | range | IX_site_id,IX_site_id_type,IX_tag_1,IX_tag_3,IX_tag_4,IX_message_time | IX_tag_4 | 5       | NULL | 24732 | Using index condition; Using where; Using filesort |
+----+-------------+-------+-------+-----------------------------------------------------------------------+----------+---------+------+-------+----------------------------------------------------+

All tables use innodb engines.

What could trigger such different behaviours at different points in time ? Namely, what could have caused a full table scan ?

Best Answer

What is happening -- The Optimizer picked a different way to perform the query.

You are at the whim of

  • The Optimizer -- It gets better with each new version.
  • The statistics -- ANALYZE TABLE may help, or could hurt. (OPTIMIZE is overkill, and ends with ANALYZE; "never" use it.)
  • The constants in the query -- May lead the optimizer to pick a different explain plan.
  • The "phase of the moon", as I call this combination of whims.

If id is the PRIMARY KEY and there is no JOIN, you are asking for a lot of extra work: COUNT(DISTINCT(t.id)),; switch to simply COUNT(*).

When a query is run, some EXPLAIN plan is generated.
When you use EXPLAIN, a plan is generated.
There is no guarantee that they match. (This is solved only in the latest version, wherein the EXPLAIN is optionally included in the slowlog.

Any single-column index on any of these may be best; there is no way to know without a lot more info on the distribution of the data: tag_1, tag_3, tag_4, site_id, message_time. I would have several 2-col "composite" indexes starting with each and ending with message_time. This way, if the query actually says foo IN (just-one-item), it can move past foo and also use message_time.

Before discussing further, please add SHOW CREATE TABLE.