Mysql – Is it REALLY possible that the order will not be guaranteed for this particular redundant derived table

derived-tablesMySQLoracleorder-bypostgresql

I stumbled upon this question on a Twitter conversation with Lukas Eder.

Although the correct behavior would be to apply the ORDER BY clause on the outermost query, because, here, we are not using DISTINCT, GROUP BY, JOIN or any other WHERE clause in the outermost query, why wouldn't a RDBMS just pass the incoming data as it was sorted by the inner query?

SELECT * 
FROM (
    SELECT * FROM table ORDER BY time DESC
) AS t

When running this example on PostgreSQL, at least, you get the same Execution Plan for both the inner query and this derived table example, as well as the same result set.

So, I would assume that the Planner will simply discard the outermost query because it's redundant or simply pass through the results from the inner table.

Does anyone think this might not be the case?

Best Answer

Most databases are quite clear about the fact that an ORDER BY in a subquery is either:

  • Not allowed: E.g. SQL Server, Sybase SQL Anywhere (unless complemented with TOP or OFFSET .. FETCH)
  • Meaningless: E.g. PostgreSQL, DB2 (again, unless complemented with OFFSET .. FETCH or LIMIT)

Here's an example from the DB2 LUW manual (emphasis mine)

An ORDER BY clause in a subselect does not affect the order of the rows returned by a query. An ORDER BY clause only affects the order of the rows returned if it is specified in the outermost fullselect.

The wording is quite explicit, just like PostgreSQL's:

If sorting is not chosen, the rows will be returned in an unspecified order. The actual order in that case will depend on the scan and join plan types and the order on disk, but it must not be relied on. A particular output ordering can only be guaranteed if the sort step is explicitly chosen.

From this specification, it can be followed that any ordering resulting from the ORDER BY clause in a derived table is merely accidental and may coincidentally match your expected ordering (which it does in most databases in your trivial example), but it would be unwise to rely on this.

Side note on DB2:

In particular, DB2 has a lesser known feature called ORDER BY ORDER OF <table-designator>, which can be used as follows:

SELECT C1 FROM
   (SELECT C1 FROM T1
      UNION
    SELECT C1 FROM T2
    ORDER BY C1 ) AS UTABLE
ORDER BY ORDER OF UTABLE

In this particular case, the ordering of the derived table can be explicitly re-used in the outer most SELECT

Side note on Oracle:

For years it has been a practice in Oracle to implement OFFSET pagination using ROWNUM, which can be reasonably calculated only after ordering a derived table:

SELECT *
FROM (
  SELECT rownum AS rn, t.* -- ROWNUM here depends on the derived table's ordering
  FROM (
    SELECT * FROM table ORDER BY time DESC
  ) t
) t
WHERE rn BETWEEN 10 AND 20

It can be reasonably expected that at least in the presence of ROWNUM in a query, future Oracle versions will not break this behaviour in order not to break pretty much all the legacy Oracle SQL out there, which has not yet migrated to the much more desireable and readable SQL standard OFFSET .. FETCH syntax:

SELECT * FROM table ORDER BY time DESC OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY