(Indexed views aside, of course.)
A view is not materialized - the data isn't stored, so how could it be sorted? A view is kind of like a stored procedure that just contains a SELECT
with no parameters... it doesn't hold data, it just holds the definition of the query. Since different references to the view could need data sorted in different ways, the way that you do this - just like selecting from a table, which is also an unsorted collection of rows, by definition - is to include the order by on the outer query.
Also to give a little insight into the history. You could never put ORDER BY
in a view, without also including TOP
. And in this case the ORDER BY
dictated which rows were included by TOP
, not how they would be presented. It just so happened that in SQL Server 2000, if TOP
was 100 PERCENT
or {some number >= number of rows in the table}
, the optimizer was fairly simplistic and it ended up producing a plan with a sort that matched the TOP/ORDER BY
. But this behavior was never guaranteed or documented - it was just relied upon based on observation, which is a bad habit. When SQL Server 2005 came out, this behavior started "breaking" because of changes in the optimizer that led to different plans and operators being used - among other things, the TOP / ORDER BY
would be ignored completely if it was TOP 100 PERCENT
. Some customers complained about this so loudly that Microsoft issued a trace flag to reinstate the old behavior. I'm not going to tell you what the flag is because I don't want you to use it and I want to make sure that the intent is correct - if you want a predictable sort order, use ORDER BY
on the outer query.
To summarize and just as much to clarify a point you made: Microsoft didn't remove anything. They made the product better, and as a side effect this undocumented, non-guaranteed behavior became less reliable. Overall, I think the product is better for it.
If it is faster you have to measure for yourself. However, doing it on the db side sends less data across, so I would assume it to be faster.
The pivot itself is fairly simple. I put your query's result in a table to make the example simpler.
SQL Fiddle
PostgreSQL 9.1.9 Schema Setup:
CREATE TABLE your_query
("rel_id" int, "timestamp" timestamp, "y" varchar(1))
;
INSERT INTO your_query
("rel_id", "timestamp", "y")
VALUES
(1, '2013-01-01 00:00:00', 'a'),
(1, '2013-01-02 00:00:00', 'b'),
(1, '2013-01-03 00:00:00', 'c'),
(1, '2013-01-04 00:00:00', 'd'),
(2, '2013-01-01 00:00:00', 'e'),
(2, '2013-01-04 00:00:00', 'f'),
(2, '2013-01-06 00:00:00', 'g')
;
First step is to only return one row per date. That is simply done with a group by
:
Query 1:
SELECT timestamp
FROM your_query
GROUP BY timestamp
ORDER BY timestamp
Results:
| TIMESTAMP |
|--------------------------------|
| January, 01 2013 00:00:00+0000 |
| January, 02 2013 00:00:00+0000 |
| January, 03 2013 00:00:00+0000 |
| January, 04 2013 00:00:00+0000 |
| January, 06 2013 00:00:00+0000 |
Now wee need to pull the "correct" value into each column. For that we combine an aggregate with a case. The case
returns null for all rows for which the condition is not met. the aggregate ignores nulls. That leaves the one value we are looking for:
Query 2:
SELECT timestamp,
MAX(CASE WHEN rel_id = 1 THEN y END ) AS "1",
MAX(CASE WHEN rel_id = 2 THEN y END ) AS "2"
FROM your_query
GROUP BY timestamp
ORDER BY timestamp
Results:
| TIMESTAMP | 1 | 2 |
|--------------------------------|--------|--------|
| January, 01 2013 00:00:00+0000 | a | e |
| January, 02 2013 00:00:00+0000 | b | (null) |
| January, 03 2013 00:00:00+0000 | c | (null) |
| January, 04 2013 00:00:00+0000 | d | f |
| January, 06 2013 00:00:00+0000 | (null) | g |
To make this work with your original query just replace your_query
in my example with
(
(SELECT rel_id, timestmap, y FROM table_1 AS full_
WHERE full_.timestamp BETWEEN %s AND %s
ORDER BY full_.timestamp)
UNION ALL
(SELECT rel_id, timestamp, y FROM table_2 AS full_
WHERE full_.timestamp BETWEEN %s AND %s
ORDER BY full_.timestamp)
UNION ALL
...
) AS your_query
Best Answer
As you haven't defined a different collation for your column in question, it uses the database-wide one, which is
en_US.UTF8
- just like on my test box. I observe the exact same behaviour, take it as a consolation :)What we see is apparently a case of the variable collation elements. Depending on the character and the collation, a number of different behaviours is possible. Here the underscore (and the hyphen and some others, too) are used only for breaking ties - 'a' and '_a' are equivalent in the first round, then the tie between them is resolved by taking the underscore into account.
If you want to sort with ignoring the underscores (and hyphens, question marks and exclamation marks in my example), you can define an ordering on an expression:
In my experiments adding a new value to the list often changes the order between otherwise equal items, showing they are treated really equal.