I use MySQL 5.6 and 5.7 on Ubuntu 16.04.3 LTS
and I have a problem with the following query (which is derived from drupal 7 but have simplified it in order to pinpoint the issue):
select distinct node.nid, node.title, users.name
from node
left join users on node.uid = users.uid
order by node.nid desc
limit 4;
My servers are provisioned with the exact same ansible script which uses the following my.cnf
:
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
bind-address = 0.0.0.0
key_buffer_size = 16M
max_allowed_packet = 1G
thread_stack = 192K
thread_cache_size = 8
query_cache_type = 0
slow_launch_time=1
innodb_buffer_pool_size = 1G
innodb_file_per_table
open-files-limit=10000
log_error = /var/log/mysql/error.log
long_query_time=2
slow_query_log=on
slow_query_log_file=/var/log/mysql/mysql-slow-queries.log
general_log=off
general_log_file=/var/log/mysql/general.log
innodb_flush_log_at_trx_commit=0
I import my database and then connect to CLI using:
echo "create database drupal" | mysql -u root -ppass
mysql -u root -ppass drupal < /vagrant/2017-10-13--08-00-backup-xxxx-db.sql
mysql -u root -ppass drupal -A
On servers with 5.6 (show variables like 'version';
gives me 5.6.16-1~exp1
) the query returns the following rows which are in the correct descending order (note: used some x
characters to anonymize dataset):
+-------+------------------------------------------+------+
| nid | title | name |
+-------+------------------------------------------+------+
| 56217 | A Coxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | ztex |
| 56216 | Estxxxxxxxxx | ctex |
| 56215 | Plexxxxxxxxxxxxxx | xovr |
| 56214 | Cloxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | xovr |
+-------+------------------------------------------+------+
But in servers with 5.7 (show variables like 'version';
gives me 5.7.20-0ubuntu0.16.04.1-log
) I get the opposite nid
order from what I expect:
+-----+---------------------------+------+
| nid | title | name |
+-----+---------------------------+------+
| 1 | AEGQ todo - January 2009 | xc |
| 2 | AEGQ todo - February 2009 | xc |
| 3 | AEGQ todo - March 2009 | xc |
| 4 | AEGQ todo - May 2009 | xc |
+-----+---------------------------+------+
In both cases the explain extended
and show warnings
outputs look the same.
5.6:
mysql> explain extended select distinct node.nid, node.title, users.name from node left join users on node.uid = users.uid order by node.nid desc limit 4;
+----+-------------+-------+--------+---------------+---------+---------+-----------------+------+------------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+------+------------+-----------------+
| 1 | SIMPLE | node | index | NULL | PRIMARY | 4 | NULL | 4 | 1359550.00 | Using temporary |
| 1 | SIMPLE | users | eq_ref | PRIMARY | PRIMARY | 4 | drupal.node.uid | 1 | 100.00 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+------+------------+-----------------+
2 rows in set, 1 warning (0.00 sec)
mysql> show warnings\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select distinct `drupal`.`node`.`nid` AS `nid`,`drupal`.`node`.`title` AS `title`,`drupal`.`users`.`name` AS `name` from `drupal`.`node` left join `drupal`.`users` on((`drupal`.`node`.`uid` = `drupal`.`users`.`uid`)) where 1 order by `drupal`.`node`.`nid` desc limit 4
1 row in set (0.00 sec)
5.7:
mysql> explain extended select distinct node.nid, node.title, users.name from node left join users on node.uid = users.uid order by node.nid desc limit 4;
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+-----------------+
| 1 | SIMPLE | node | NULL | index | NULL | PRIMARY | 4 | NULL | 4 | 100.00 | Using temporary |
| 1 | SIMPLE | users | NULL | eq_ref | PRIMARY | PRIMARY | 4 | drupal.node.uid | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+-----------------+
2 rows in set, 2 warnings (0.00 sec)
mysql> show warnings\G
*************************** 1. row ***************************
Level: Warning
Code: 1681
Message: 'EXTENDED' is deprecated and will be removed in a future release.
*************************** 2. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select distinct `drupal`.`node`.`nid` AS `nid`,`drupal`.`node`.`title` AS `title`,`drupal`.`users`.`name` AS `name` from `drupal`.`node` left join `drupal`.`users` on((`drupal`.`node`.`uid` = `drupal`.`users`.`uid`)) where 1 order by `drupal`.`node`.`nid` desc limit 4
2 rows in set (0.00 sec)
On both servers show variables like 'coll%';
returns the same:
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
And of course, on both servers the schema and data are the same:
| node | CREATE TABLE `node` (
`nid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`vid` int(10) unsigned DEFAULT NULL,
`type` varchar(32) NOT NULL DEFAULT '',
`language` varchar(12) NOT NULL DEFAULT '',
`title` varchar(255) NOT NULL DEFAULT '',
`uid` int(11) NOT NULL DEFAULT '0',
`status` int(11) NOT NULL DEFAULT '1',
`created` int(11) NOT NULL DEFAULT '0',
`changed` int(11) NOT NULL DEFAULT '0',
`comment` int(11) NOT NULL DEFAULT '0',
`promote` int(11) NOT NULL DEFAULT '0',
`sticky` int(11) NOT NULL DEFAULT '0',
`tnid` int(10) unsigned NOT NULL DEFAULT '0',
`translate` int(11) NOT NULL DEFAULT '0',
`rh_action` int(11) DEFAULT NULL,
`rh_redirect` longtext,
`rh_redirect_response` int(11) DEFAULT NULL,
PRIMARY KEY (`nid`),
UNIQUE KEY `vid` (`vid`),
KEY `node_changed` (`changed`),
KEY `node_created` (`created`),
KEY `node_frontpage` (`promote`,`status`,`sticky`,`created`),
KEY `node_status_type` (`status`,`type`,`nid`),
KEY `node_title_type` (`title`,`type`(4)),
KEY `node_type` (`type`(4)),
KEY `uid` (`uid`),
KEY `tnid` (`tnid`),
KEY `translate` (`translate`),
KEY `language` (`language`)
) ENGINE=InnoDB AUTO_INCREMENT=56997 DEFAULT CHARSET=utf8 COMMENT='The base table for nodes.' |
and
| users | CREATE TABLE `users` (
`uid` int(10) unsigned NOT NULL DEFAULT '0',
`name` varchar(60) NOT NULL DEFAULT '',
`pass` varchar(128) NOT NULL DEFAULT '',
`mail` varchar(254) DEFAULT '',
`theme` varchar(255) NOT NULL DEFAULT '',
`signature` varchar(255) NOT NULL DEFAULT '',
`signature_format` varchar(255) DEFAULT NULL,
`created` int(11) NOT NULL DEFAULT '0',
`access` int(11) NOT NULL DEFAULT '0',
`login` int(11) NOT NULL DEFAULT '0',
`status` tinyint(4) NOT NULL DEFAULT '0',
`timezone` varchar(32) DEFAULT NULL,
`language` varchar(12) NOT NULL DEFAULT '',
`picture` int(11) NOT NULL DEFAULT '0',
`init` varchar(254) DEFAULT '',
`data` longblob,
PRIMARY KEY (`uid`),
UNIQUE KEY `name` (`name`),
KEY `access` (`access`),
KEY `created` (`created`),
KEY `mail` (`mail`),
KEY `picture` (`picture`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Stores user data.' |
To recap, what is the same:
- O/S, kernel, software components and underlying virtualization method
/etc/mysql/my.cnf
- imported database file (schema + data are the same)
- database user
- select query
- explain select output
- server, database and connection collations
What is different:
- mysql version
sql_mode
. As shown in mymy.cnf
I've left defaults for 5.6 and 5.7 which are:-
- 5.6:
NO_ENGINE_SUBSTITUTION
- 5.6:
-
- 5.7:
ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
- 5.7:
Problem:
- order of results looks wrong on mysql 5.7
How can this be explained? Have the semantics of distinct keyword changed in MySQL 5.7?
Best Answer
We don't get to say this often but in this case:
If this happens, it's a bug.
A query that has:
should apply the
ORDER BY
andLIMIT
after theSELECT
- whether it's aSELECT DISTINCT
or not - and return the rows with the highest fournid
values in descending order. In the 5.7.20 case, it seems that it doesn't and theORDER BY
is ignored (or it runs asASC
), and the rows are returned with ascending order, so it must be a bug.The right thing to do in these cases, is to report the bug with a sample test case, to MySQL bug list.
Things you could also do - to confirm, make it easier for MySQL developers to fix, and remedy the issue until the bug is fixed - is to try:
DISTINCT
, seems useless anyway. Thenode.nid
is unique since it's the primary key and the join is onuser.uid
which is the primary key of the other table, so the distinct will have no effect.DISTINCT
and addGROUP BY node.nid, node.title, users.name
or justGROUP BY node.nid, users.name
.ONLY_FULL_GROUP_BY
setting.And please include the findings (if the bug persists with the above tries) in your bug report to MySQL - to help them identify the cause of the problem. I suspect it's a regression due to some optimization improvement related to
GROUP BY
.