Mysql – A MySQL EXPLAIN number of rows discrepancy

explainMySQLperformanceperformance-tuning

MySQL 5.5.49-log

More questions on the query in Why does it use temporary? (MySQL) (the query is the same but the question is different):

I have the following table (filled with many rows):

CREATE TABLE `SectorGraphs2` (
  `Kind` tinyint(3) UNSIGNED NOT NULL COMMENT '1 - продюсер, 2 - жанр, 3 - регион',
  `Criterion` tinyint(3) UNSIGNED NOT NULL,
  `Period` tinyint(3) UNSIGNED NOT NULL,
  `PeriodStart` date NOT NULL,
  `SectorID` int(10) UNSIGNED NOT NULL,
  `Value` float NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

ALTER TABLE `SectorGraphs2`
  ADD UNIQUE KEY `Producer2` (`Kind`,`Criterion`,`Period`,`PeriodStart`,`SectorID`) USING BTREE,
  ADD KEY `SectorID` (`SectorID`);

then I run:

EXPLAIN 
    SELECT SectorID, SUM(Value)
    FROM SectorGraphs2
    WHERE Kind = 1 AND Criterion = 7
      AND Period = 1
      AND PeriodStart >= ? AND PeriodStart < ? + INTERVAL 1 WEEK
    GROUP BY SectorID

and it produces:

+----+-------------+---------------+-------+---------------+-----------+---------+------+------+----------------------------------------------+
| id | select_type | table         | type  | possible_keys | key       | key_len | ref  | rows | extra                                        |
+----+-------------+---------------+-------+---------------+-----------+---------+------+------+----------------------------------------------+
|  1 | SIMPLE      | SectorGraphs2 | range | Producer2     | Producer2 | 6       | NULL | 1    | Using where; Using temporary; Using filesort |
+----+-------------+---------------+-------+---------------+-----------+---------+------+------+----------------------------------------------+

See a nicely formatted explanation here.

My question: Why it is used a temporary table and filesort but it reports only 1 row examined? It seems that because of using a temporary table, it should process more than one row. How can I determine the real number of rows processed? How to solve this discrepancy about number of processed rows?

Note that the task I was assigned to do now is to eliminate heavy (involving too many rows) queries. And now I do not know how to do this.

Best Answer

Try to replace AND PeriodStart >= ? AND PeriodStart < ? + INTERVAL 1 WEEK clause with precalculated version:

SET @PeriodEnd = ? + INTERVAL 1 WEEK;
SELECT SectorID, SUM(Value)
    FROM SectorGraphs2
    WHERE Kind = 1 AND Criterion = 7
      AND Period = 1
      AND PeriodStart BETWEEN ? AND @PeriodEnd
    GROUP BY SectorID;

Explicitly point the index you want to use:

SET @PeriodEnd = ? + INTERVAL 1 WEEK;
SELECT SectorID, SUM(Value)
    FROM SectorGraphs2 USE INDEX (`Producer2`)
    WHERE Kind = 1 AND Criterion = 7
      AND Period = 1
      AND PeriodStart BETWEEN ? AND @PeriodEnd
    GROUP BY SectorID;

Reorder fields in the index definition placing more selective fields first. To determine selectivity run the next query:

SELECT COUNT( DISTINCT Criterion )
  FROM SectorGraphs2;

Then replace Criterion with Kind, PeriodStart, SectorID etc.

The higher count means better selectivity and you have to arrange fields in the index from the best selectivity to worst.