MySQL – Ordering Table by Primary Key

mariadb-10.2MySQLmysql-5.5order-byprimary-key

I have a table with a primary key consisting of two integer columns.
I store ~200 millions lines in this table.
My program needs to get the full table ordered by the columns of the primary key in the same order.

So, my query looks like Select * from MYTABLE order by PK1,PK2;

I noticed that instead of sending data immediately, the query spend a lot of time in the "sorting result" state.
I would have thought that mysql would use the primary key index to access the data directly in the required order. But, it seems it scans it without order and then sorts the data afterward.

Is it possible to change the behavior and make the query more efficient ?

Remark : I have tried using mysql 5.5 and mariadb10.2.

Here are some (I Hope) useful information (I have tried to EXPLAIN some orders using indexed and non-indexed columns and all give the same result)

mysql> show indexes from reponse;
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| reponse |          0 | PRIMARY     |            1 | objet_id    | A         |      502989 |     NULL | NULL   |      | BTREE      |         |               |
| reponse |          0 | PRIMARY     |            2 | question_id | A         |   189627157 |     NULL | NULL   |      | BTREE      |         |               |
| reponse |          1 | idxQuestion |            1 | question_id | A         |        5340 |     NULL | NULL   |      | BTREE      |         |               |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0,30 sec)

mysql> explain select * from reponse order by objet_id,question_id;
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
| id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows      | Extra          |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
|    1 | SIMPLE      | reponse | ALL  | NULL          | NULL | NULL    | NULL | 189627158 | Using filesort |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
1 row in set (0,23 sec)

mysql> explain select * from reponse order by question_id,objet_id;
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
| id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows      | Extra          |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
|    1 | SIMPLE      | reponse | ALL  | NULL          | NULL | NULL    | NULL | 189627159 | Using filesort |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
1 row in set (0,44 sec)

mysql> explain select * from reponse order by objet_id,question_id;
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
| id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows      | Extra          |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
|    1 | SIMPLE      | reponse | ALL  | NULL          | NULL | NULL    | NULL | 189627159 | Using filesort |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
1 row in set (0,11 sec)

mysql> explain select * from reponse order by n_pose0;
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
| id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows      | Extra          |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
|    1 | SIMPLE      | reponse | ALL  | NULL          | NULL | NULL    | NULL | 189627175 | Using filesort |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
1 row in set (0,26 sec)

mysql> explain select * from reponse order by question_id;
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
| id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows      | Extra          |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
|    1 | SIMPLE      | reponse | ALL  | NULL          | NULL | NULL    | NULL | 189627421 | Using filesort |
+------+-------------+---------+------+---------------+------+---------+------+-----------+----------------+
1 row in set (0,12 sec)

mysql> select version();
+--------------------------------------+
| version()                            |
+--------------------------------------+
| 10.2.12-MariaDB-10.2.12+maria~jessie |
+--------------------------------------+

The table REPONSE was created like this :

    CREATE TABLE `reponse` (
  `objet_id` int(11) NOT NULL,
  `question_id` int(11) NOT NULL,
  `n_pose0` int(11) NOT NULL,
  `n_pose1` int(11) NOT NULL,
  `n_pose2` int(11) NOT NULL,
  `n_pose3` int(11) NOT NULL,
  `n_pose4` int(11) NOT NULL,
  PRIMARY KEY (`objet_id`,`question_id`),
  KEY `idxQuestion` (`question_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

In complement to the given answer, you can add an hint for forcing the engine to use an index. I tested it and the response time has doubled… It seems the optimizer did a good job this time :

date;mysql --quick -e 'select * from reponse order by objet_id,question_id' > /dev/null;date
Tue Nov 20 15:49:22 CET 2018
Tue Nov 20 15:52:19 CET 2018
date;mysql --quick -e 'select * from reponse force index(PRIMARY) order by objet_id,question_id' > /dev/null;date
Tue Nov 20 15:52:45 CET 2018
Tue Nov 20 15:58:23 CET 2018

Best Answer

ENGINE=MyISAM, that's the reason for the chosen query plan.

If the table was using the InnoDB engine, then the primary key index would be the clustered key of the table, so it wouldn't have to do sorting after reading it, as the query's ORDER BY matches the primary key.

With MyISAM it has two options: to read the index and the (heap) table (without sorting) or only the table and do a sort. It chooses the second because the optimizer thinks it's faster.