Mysql – getting different results from a query vs. a view

greatest-n-per-groupgroup byMySQLview

The base problem I'm facing is that I need the latest record for each item.

A little setup… MySQL 5.6.14.

I need to create two views (because MySQL won't let me have a subquery in a view). My first query sets up the data like this.

select 
    `inventoryrecords`.`inventoryrecordid` AS `inventoryrecordid`,
    `inventoryrecords`.`logicaldeviceid` AS `logicaldeviceid`,
    `inventoryrecords`.`passrfid` AS `passrfid`,
    `inventoryrecords`.`tagepc` AS `tagepc`,
    `inventoryrecords`.`currentstate` AS `currentstate`,
    `inventoryrecords`.`statedateutc` AS `statedateutc`,
    `inventoryrecords`.`ownerobjectid` AS `ownerobjectid`,
    `inventoryrecords`.`ownerobjecttype` AS `ownerobjecttype`
from
    `inventoryrecords`
where
    1
order by `inventoryrecords`.`statedateutc` desc

Then I can restrict everything to just the last record for each TagEPC with my "real" query like this.

select 
    `lastinventoryrecords_step1`.`inventoryrecordid` AS `inventoryrecordid`,
    `lastinventoryrecords_step1`.`logicaldeviceid` AS `logicaldeviceid`,
    `lastinventoryrecords_step1`.`passrfid` AS `passrfid`,
    `lastinventoryrecords_step1`.`tagepc` AS `tagepc`,
    `lastinventoryrecords_step1`.`currentstate` AS `currentstate`,
    `lastinventoryrecords_step1`.`statedateutc` AS `statedateutc`,
    `lastinventoryrecords_step1`.`ownerobjectid` AS `ownerobjectid`,
    `lastinventoryrecords_step1`.`ownerobjecttype` AS `ownerobjecttype`
from
    `lastinventoryrecords_step1`
group by `lastinventoryrecords_step1`.`tagepc`
order by `lastinventoryrecords_step1`.`statedateutc` desc

When I try to select * from the "real" view, I do not get the data I expect. However when I run the query with the sub-query in a window I do.

select * from ( 
lastinventoryrecords_step1 )
group by tagepc
order by statedateutc desc

yields incorrect data whereas

select * from (
    select 
        inventoryrecordid,
        logicaldeviceid,
        passrfid,
        tagepc,
        currentstate,
        statedateutc,
        ownerobjectid,
        ownerobjecttype
    from
        inventoryrecords
    where
        1
    order by statedateutc desc ) as t1
group by tagepc
order by statedateutc desc

yields the right data.

What in the heck is going on? What am I missing?

Best Answer

You're using GROUP BY in a non-deterministic way. You GROUP BY tagepc, but what value for the other columns do you expect?

MySQL permits this kind of query, but MySQL chooses a row from the group arbitrarily. In practice, it tends to be the first row in the group, as they are stored physically.

But in the case when you query through the view, the view stores its results in a temporary table, probably in a different physical order than the data in the original table.

Note that http://dev.mysql.com/doc/refman/5.6/en/create-view.html says:

ORDER BY is permitted in a view definition, but it is ignored if you select from a view using a statement that has its own ORDER BY.

So the query run by the view to populate the temp table ignores your view's ORDER BY, and since you used GROUP BY it probably takes values from arbitrary rows in the underlying table.

How to fix this? Stop running ambiguous GROUP BY queries. Any column referenced in your select-list should either by (a) listed in the GROUP BY clause, or (b) inside a grouping function like MAX() or something.

I don't know what you expect this query to do, or why you're using GROUP BY in the first place. So I can't offer any more specific advice.


Re your comment:

Thanks for the clarification. This is a pretty common type of SQL question. I created the tag on StackOverflow and now it has close to a thousand questions. The tag exists on dba.stackexchange.com, but with fewer questions so far.

Here's a solution:

SELECT i.*
FROM inventoryrecords AS i
JOIN (SELECT TagEPC, MAX(StateDateUTC) AS StateDateUTC
      FROM inventoryrecords
      GROUP BY TagEPC ORDER BY NULL) AS maxi USING (TagEPC, StateDateUTC);