No, there is no documentation from Microsoft guaranteeing the behavior, therefore it is not guaranteed.
Additionally, assuming that the Simple Talk article is correct, and that the Concatenation physical operator always processes inputs in the order shown in the plan (very likely to be true), then without a guarantee that SQL Server will always generate plans that keep the same the order between the query text and the query plan, you're only slightly better off.
We can investigate this further though. If the query optimizer was able to reorder the Concatenation operator input, there should exist rows in the undocumented DMV, sys.dm_exec_query_transformation_stats
corresponding to that optimization.
SELECT * FROM sys.dm_exec_query_transformation_stats
WHERE name LIKE '%CON%' OR name LIKE '%UNIA%'
On SQL Server 2012 Enterprise Edition, this produces 24 rows. Ignoring the false matches for transformations related to constants, there is one transformation related to the Concatenation Physical Operator UNIAtoCON
(Union All to Concatenation). So, at the physical operator level, it appears that once a concatenation operator is selected, it will be processed in the order of the logical Union All operator it was derived from.
In fact that is not quite true. Post-optimization rewrites exist that can reorder the inputs to a physical Concatenation operator after cost-based optimization has completed. One example occurs when the Concatenation is subject to a row goal (so it may be important to read from the cheaper input first). See UNION ALL
Optimization by Paul White for more details.
That late physical rewrite was functional up to and including SQL Server 2008 R2, but a regression meant it no longer applied to SQL Server 2012 and later. A fix has been issued that reinstates this rewrite for SQL Server 2014 and later (not 2012) with query optimizer hotfixes enabled (e.g. trace flag 4199).
But about the Logical Union All operator (UNIA
)? There is a UNIAReorderInputs
transformation, which can reorder the inputs. There are also two physical operators that can be used to implement a logical Union All, UNIAtoCON
and UNIAtoMERGE
(Union All to Merge Union).
Therefore it appears that the query optimizer can reorder the inputs for a UNION ALL
; however, it doesn't appear to be a common transformation (zero uses of UNIAReorderInputs
on the SQL Servers I have readily accessible. We don't know the circumstances that would make the optimizer use UNIAReorderInputs
; though it is certainly used when a plan guide or use plan hint is used to force a plan generated using the row goal physical reordered inputs mentioned above.
Is there a way to have the engine process more than one input at a time?
The Concatenation physical operator can exist within a parallel section of a plan. With some difficulty, I was able to produce a plan with parallel concatenations using the following query:
SELECT userid, regdate FROM ( --Users table is around 3mil rows
SELECT userid, RegDate FROM users WHERE userid > 1000000
UNION
SELECT userid, RegDate FROM users WHERE userid < 1000000
UNION all
SELECT userid, RegDate FROM users WHERE userid < 2000000
) d ORDER BY RegDate OPTION (RECOMPILE)
So, in the strictest sense, the physical Concatenation operator does seem to always process inputs in a consistent fashion (top one first, bottom second); however, the optimizer could switch the order of the inputs before choosing the physical operator, or use a Merge union instead of a Concatenation.
Window Functions, or the OVER() Clause
It appears that what you are looking for is a window function or an OVER()
clause.
In your original example, you are trying to use two max()
conditions, which doesn't work, because when you try to then apply GROUP BY
, you can have a condition where the max of the first column ID
and the second column startdate
aren't in the same row, and so then GROUP BY
simply can't be understood.
However, if you are looking to extract the max over a grouping, and you can define that grouping, and you want to obtain the 'column-wise' maximum for more than one column for that grouping (as you seem to want to do), then here's the solution.
CREATE TABLE test (
jobcode integer,
id integer,
a_startdate datetime,
a_enddate datetime,
b_startdate datetime,
b_enddate datetime);
INSERT INTO test VALUES (513801, 7136, '11-01-2011', '12-31-9998', '11-01-2011', '12-31-9998');
INSERT INTO test VALUES (513801, 7137,'04-26-2014', '12-31-9998','04-26-2014', '12-31-9998');
I first made this table to recreate your input data in the picture. I then applied this query.
SELECT jobcode, max(id) OVER (PARTITION BY jobcode),
max(a_startdate) OVER(PARTITION BY jobcode),
a_enddate, b_startdate, b_enddate FROM test;
Try out this SQL Fiddle, and see if it gives the results you're looking for. I did the best I could with the description I had.
Best Answer
Take a look at the plans. When you use
SELECT *
it probably uses the clustered index, and when you only want one column, maybe there is a skinnier index to use.Don't "expect" a certain order. If you don't tell SQL Server how to order, then it will use the most efficient way possible, and this can change due to probably more than 20 factors.
If you want a certain order, SAY SO. Please read #3 here:
Also, this post by Michael Swart may be an interesting read:
If you want your second query to be more efficient, you can consider creating an index on
DateCreated
(you may want to includeOrderID
- not sure of the present index structure).For your actual goal of deleting
n
rows at a time, oldest first, and assumingOrderID
is anIDENTITY
column (so order create date should roughly align with that), why not use this approach (based on this great blog post, also by Michael Swart):In order to minimize the impact on the log, you may want to add some additional logic in there, from my blog post Break large delete operations into chunks. As for the
dbo
prefix, see Bad habits to kick : Avoiding the schema prefix.