The SQL Server documentation on the OVER
clause HERE, mentions that the OVER
clause:
Determines the partitioning and ordering of a rowset before the associated window function is applied
By this I assume that aggregate functions are 'window functions'? While I understand what a window is (or at least how it works) when using the OVER
clause, what is the default window for an aggregation function when no OVER
clause is used?
i.e. for the SQL statement:
SELECT
someCol,
someCol2,
AVG(someCol3) [avg]
FROM
tbl
GROUP BY
someCol,
someCol2
What is the Window that the SUM function is being applied to?
Is it the grouped table prior to the SUM function being applied? In this case is it correct to assume that all window functions are the last operations to be applied to a result set? Since FROM / WHERE / GROUP BY / HAVING / SELECT / ORDER BY
must all have been worked out to produce the final window on which to apply aggregation functions?
====================
In response to the answer below by Max Vernon:
If I write this query:
SELECT
CP.iYear,
AVG(CP.mUpgradeCost) OVER(PARTITION BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP
I get the results you refer to, with a row for each year in CapitalProject
, and since there are many repeated years I get the same average for each year. I.e.
+-------+------------+
| iYear | avg |
+-------+------------+
| 0 | 18636.3636 |
+-------+------------+
| 0 | 18636.3636 |
+-------+------------+
| 1 | 46741.5151 |
+-------+------------+
| 1 | 46741.5151 |
+-------+------------+
I then changed the SQL to group on years so that I got a single line per year. the SQL is:
SELECT
CP.iYear,
AVG(AVG(CP.mUpgradeCost)) OVER(PARTITION BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP
GROUP BY CP.iYear
HOWEVER, when I altered the function to this:
SELECT
CP.iYear,
SUM(AVG(CP.mUpgradeCost)) OVER(PARTITION BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP
GROUP BY CP.iYear
I was expecting the averages for each row over the partitioned table to be added together. But they didn't. Why is that?
Clearly the window that the SUM is being applied to is not a nested query
as i thought it might be
Best Answer
I don't think you understand what's going on from your interactions on another question which you linked here.
So let's clarify a few things.
Namely
Empty
OVER()
You require an OVER clause to have a window. And, as far as I know all sql variants require at least,
OVER ()
. So the question, what does anOVER ()
do..Well, a window is composed, per the 2011 spec as
All of these have defaults.
SQL Evaluation of
GROUP BY
andWINDOW
This shows the
sum
,avg
, andcount
of three windows. Here you'll notice we have one window (by definition) over which each function operates, that window adheres to the defaults listed above:sum
,avg
, andcount
each show only one result because they're operating over unbounded unpartitioned result-sets.Here we create three groups, with one row in each group. We show the
sum
,avg
, andcount
of those groups. Because each group previously had only one row, our result set is still three rows.GROUP BY
didn't do much, but you'll notice that the aggregates only run over the group.Here is where I think you're confused...
When we
GROUP BY (x)
with a window function, we add more complexity. In the above we still get the same result set, but we calculate first thesum
,avg
, andcount
over aGROUP
with one row. Look at the query right above this one. Then after that runs, we calculate the window functionssum
(1+2+3)
,avg
(1+2+3)/2
, andcount
1+1+1
over a window which includes all the aggregates above.sum(sum(x)) OVER ()
finds thesum(x)
for the group which is one row, and then after that's done processing thesum() OVER ()
for the window.