SQL Server: Order of Rows

Tags: group-by, order-by, select, sql-server

We all know that a simple statement such as:

SELECT * FROM stuff;

should not produce an ordered result. However, when I try to demonstrate this, it always comes out in primary key order.

Moreover, a statement such as:

SELECT thing,whatever FROM stuff
GROUP BY thing,whatever;

always appears to sort thing by the last field in the GROUP BY clause, which is not at all helpful.

The question is: under what circumstances does SQL Server order the result even if unasked, and what can I do to stop this?

I am trying to demonstrate to my students that order is indeterminate unless specified, but this is not helping my case.

I admit that I am working with a small set of sample data.

Thanks

Best Answer

under what circumstances does SQL Server order the result even if unasked?

This can happen when the access path used to retrieve the data happens to produce rows that are already ordered, e.g. an index scan in key order. It can also happen when the plan contains an explicit sort to supply ordered input to operators that require it, such as a merge join or stream aggregate. Without an explicit ORDER BY no order is guaranteed, but SQL Server won't go out of its way to prevent results from being output in order either.

and what can I do to stop this?

For the first query, you can see a different ordering if SQL Server uses an allocation order scan.

So make sure the table is at least 64 pages in size and that the allocation order isn't the same as the key order, then run the query at the read uncommitted isolation level.

CREATE TABLE T
(
X INT IDENTITY PRIMARY KEY,
Y CHAR(4000)
);

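/* Insert 100 rows; X is filled in by IDENTITY, so only Y is supplied.
   Two rows fit on each page at this point. */
INSERT INTO T (Y)
SELECT TOP 100 'A'
FROM master..spt_values;

/* Widen Y to cause page splits, so that key order and allocation order
   differ. This leaves one row per page, so the table is now > 64 pages. */
ALTER TABLE T ALTER COLUMN Y CHAR(4001);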

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT *
FROM T;
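
If the rows still come back in key order, it may be worth checking that the two preconditions actually hold for your copy of the table. The following is a minimal sketch, assuming T lives in the current database and default schema; it uses the sys.dm_db_index_physical_stats function to report the page count and fragmentation of the clustered index.

```sql
-- Sketch only: inspect the clustered index behind the primary key of T.
-- Assumes T was created in the current database.
SELECT index_type_desc,
       page_count,                    -- should be greater than 64
       avg_fragmentation_in_percent   -- a high value suggests that allocation
                                      -- order and key order differ
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'T'), NULL, NULL, 'LIMITED');
```

A page count of 64 or less, or a fragmentation figure near zero, would suggest the page splits did not happen as intended, in which case an allocation order scan would not show anything different from a key order scan.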

Read uncommitted isn't required for the above if a table lock is taken. There is a hint for that (TABLOCK), but a less obvious way would be to change the above table definition to

CREATE TABLE T
(
X INT IDENTITY PRIMARY KEY WITH(allow_row_locks = off, allow_page_locks = off),
Y CHAR(4000)
);

Then, hopefully, you will see disordered results without having to change the isolation level.
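
For comparison, the more obvious route is the table lock hint, which is presumably TABLOCK. A minimal sketch, assuming the first version of T (the one with the page splits) and the default read committed isolation level:

```sql
-- Sketch only: make sure we are not relying on read uncommitted.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- TABLOCK requests a table-level lock for this statement, which makes an
-- allocation order scan possible without changing the isolation level.
SELECT *
FROM T WITH (TABLOCK);
```

As with the other variants, this only makes the allocation order scan possible. Nothing guarantees that the optimizer will use it or that the rows will come back out of key order.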

For the second query, you want a hash aggregate rather than a stream aggregate. You can force this with a query hint or, less obviously, with a plan guide, but the optimizer is more likely to choose it organically when there are relatively few distinct groups compared to the size of the table.
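
A minimal sketch of the query hint route, using the stuff table from the question (its actual contents are an assumption here):

```sql
-- Sketch only: HASH GROUP asks the optimizer to use hash aggregation
-- for the GROUP BY instead of a sort followed by a stream aggregate.
SELECT thing, whatever
FROM stuff
GROUP BY thing, whatever
OPTION (HASH GROUP);
```

With a hash aggregate the groups tend to come out in an order determined by the hash table rather than by the grouping columns, though on a very small table the result may still happen to look sorted.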