I have this schema with data:
CREATE TABLE `ticket` (
`id` INT NOT NULL,
`name` VARCHAR(45) NULL,
`type` INT(11) NULL,
PRIMARY KEY (`id`));
CREATE TABLE `task` (
`id` int(11) NOT NULL,
`name` varchar(45) DEFAULT NULL,
`ticket_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `fk_task_1_idx` (`ticket_id`),
CONSTRAINT `fk_task_1` FOREIGN KEY (`ticket_id`) REFERENCES `ticket` (`id`)
) ENGINE=InnoDB;
CREATE TABLE `planing` (
`id` int(11) NOT NULL,
`date` datetime DEFAULT NULL,
`task_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `fk_planing_1_idx` (`task_id`),
CONSTRAINT `fk_planing_1` FOREIGN KEY (`task_id`) REFERENCES `task` (`id`)
) ENGINE=InnoDB ;
INSERT INTO `ticket` (`id`, `name`, `type`) VALUES (1, 'ticket_1', 1);
INSERT INTO `task` (`id`, `name`, `ticket_id`) VALUES (1, 'task_1', 1);
INSERT INTO `task` (`id`, `name`, `ticket_id`) VALUES (2, 'task_2', 1);
INSERT INTO `planing` (`id`, `date`, `task_id`) VALUES (1, '2016-06-01 05:00', 2);
INSERT INTO `ticket` (`id`, `name`, `type`) VALUES (2, 'ticket_2', 1);
INSERT INTO `task` (`id`, `name`, `ticket_id`) VALUES (3, 'task_3', 2);
INSERT INTO `task` (`id`, `name`, `ticket_id`) VALUES (4, 'task_4', 2);
INSERT INTO `planing` (`id`, `date`, `task_id`) VALUES (2, '2016-06-01 05:00', 3);
INSERT INTO `planing` (`id`, `date`, `task_id`) VALUES (3, '2016-07-01 05:00', 4);
INSERT INTO `ticket` (`id`, `name`, `type`) VALUES (3, 'ticket_3', 1);
When I perform a select:
SELECT ticket.id AS ticket_id, task.id AS task_id, planing.date as due_date
FROM ticket
LEFT JOIN task ON task.ticket_id = ticket.id
LEFT JOIN planing ON (task.id = planing.task_id AND planing.date IS NOT NULL)
WHERE ticket.type = 1
GROUP BY ticket.id
ORDER BY planing.date ASC
;
on MySQL 5.5 I get
ticket_id | task_id | due_date
1 |1 |null
3 |null |null
2 |3 |2016-06-01 05:00:00
and on MySQL 5.6 I get
ticket_id | task_id | due_date
3 |null |null
1 |2 |2016-06-01 05:00:00
2 |3 |2016-06-01 05:00:00
How to get the same behavior on MySQL 5.5 as on MySQL 5.6 (app server uses the MySQL 5.5 version)?
Best Answer
You just need to stop relying on the non-standard behaviour provided by GROUP BY when the
only_full_group_by
SQL mode is not enabled.Normally, when grouping, you cannot retrieve a column that is not specified in the GROUP BY clause unless you wrap it in an aggregate function. That is because that column may have many different values (remember, we are talking about grouped rows) and there is no way to specify which value of the many should be used in the output. There is a recent extension to that behaviour according to which columns that are functionally dependent on the GROUP BY columns may be selected without specifying them in the GROUP BY, but MySQL has so far went the simple road of offering you just two modes: either you cannot retrieve non-GROUP BY columns without aggregation at all or you can retrieve any such columns, regardless of functional dependence.
So, in this case, you are retrieving columns
task_id
anddue_date
that clearly can have various values perticket_id
. Which value is ultimately selected is out of your control, and that is the problem. The solution is to define for yourself exactly which row from the other tables you want to get when you are grouping byticket.id
and implement the logic the way it can be followed unambiguously.For instance, if you want to go by the behaviour you are observing in MySQL 5.6, you could define it as
Based on that, you could define the logic as follows:
This is certainly much more coding than your original query, but the extra bit of code is not redundant, because this way you are making your results predictable.
The only simpler alternative that might emulate the MySQL 5.6 behaviour could be to rewrite the query like this:
In my tests, the above query returned the same results for 5.5 as the original did for 5.6 for your example. Although the two queries should logically give equivalent results, they are not exactly equivalent: the rewritten query explicitly changes the logical order of joins. Perhaps that prompted MySQL to choose a different path through the data.
But whatever the reason for the different output, please keep in mind that there is no guarantee the behaviour will match for every scenario. I will repeat, you need to stop relying on the behaviour that is not documented, or, actually, documented to be undefined: