Sql-server – the default ‘window’ an aggregate function is applied to

sql serverwindow functions

The SQL Server documentation on the OVER clause HERE, mentions that the OVER clause:

Determines the partitioning and ordering of a rowset before the associated window function is applied

By this I assume that aggregate functions are 'window functions'? While I understand what a window is (or at least how it works) when using the OVER clause, what is the default window for an aggregation function when no OVER clause is used?

i.e. for the SQL statement:

SELECT
  someCol,
  someCol2,
  AVG(someCol3) [avg]
FROM
  tbl
GROUP BY
  someCol,
  someCol2

What is the Window that the SUM function is being applied to?

Is it the grouped table prior to the SUM function being applied? In this case is it correct to assume that all window functions are the last operations to be applied to a result set? Since FROM / WHERE / GROUP BY / HAVING / SELECT / ORDER BY must all have been worked out to produce the final window on which to apply aggregation functions?

====================

In response to the answer below by Max Vernon:

If I write this query:

SELECT
    CP.iYear,
    AVG(CP.mUpgradeCost) OVER(PARTITION BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP

I get the results you refer to, with a row for each year in CapitalProject, and since there are many repeated years I get the same average for each year. I.e.

+-------+------------+
| iYear | avg        |
+-------+------------+
| 0     | 18636.3636 |
+-------+------------+
| 0     | 18636.3636 |
+-------+------------+
| 1     | 46741.5151 |
+-------+------------+
| 1     | 46741.5151 |
+-------+------------+

I then changed the SQL to group on years so that I got a single line per year. the SQL is:

SELECT
    CP.iYear,
    AVG(AVG(CP.mUpgradeCost)) OVER(PARTITION BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP
GROUP BY CP.iYear

HOWEVER, when I altered the function to this:

SELECT
    CP.iYear,
    SUM(AVG(CP.mUpgradeCost)) OVER(PARTITION BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP
GROUP BY CP.iYear

I was expecting the averages for each row over the partitioned table to be added together. But they didn't. Why is that?

Clearly the window that the SUM is being applied to is not a nested query as i thought it might be

Best Answer

I don't think you understand what's going on from your interactions on another question which you linked here.

Search for this line in the question: AVG(AVG(CP.mUpgradeCost)) OVER(PARTITION BY iYear) AS [avg]. If I change that to SUM(AVG(CP.mUpgradeCost)) OVER(PARTITION BY iYear) AS [sum], I get the same value

So let's clarify a few things.

Namely

  1. Empty OVER ()
  2. SQL evaluation

Empty OVER()

what is the default window for an aggregation function when no OVER clause is used?

You require an OVER clause to have a window. And, as far as I know all sql variants require at least, OVER (). So the question, what does an OVER () do..

Well, a window is composed, per the 2011 spec as

  • The window name.
  • Optionally, the ordering window name—that is, the name of another window, called the ordering window, that is used to define the partitioning and ordering of the present window.
  • The window partitioning clause—that is, a if any is specified in either the present or in the window descriptor of the ordering window.
  • The window ordering clause—that is, a if any is specified in either the present or in the window descriptor of the ordering window.
  • The window framing clause—that is, a , if any.

All of these have defaults.

  • The order defaults to undefined
  • The partition defaults to the entire query
  • The name defaults to something undefined (platform-dependent) implicitly.

SQL Evaluation of GROUP BY and WINDOW

This shows the sum, avg, and count of three windows. Here you'll notice we have one window (by definition) over which each function operates, that window adheres to the defaults listed above: sum, avg, and count each show only one result because they're operating over unbounded unpartitioned result-sets.

SELECT x, sum(x) OVER (), avg(x) OVER (), count(x) OVER ()
FROM ( VALUES (1),(2),(3) ) AS t(x);

 x | sum |        avg         | count 
---+-----+--------------------+-------
 1 |   6 | 2.0000000000000000 |     3
 2 |   6 | 2.0000000000000000 |     3
 3 |   6 | 2.0000000000000000 |     3
(3 rows)

Here we create three groups, with one row in each group. We show the sum, avg, and count of those groups. Because each group previously had only one row, our result set is still three rows. GROUP BY didn't do much, but you'll notice that the aggregates only run over the group.

SELECT x, sum(x), avg(x), count(x)
FROM ( VALUES (1),(2),(3) ) AS t(x)
GROUP BY x;

 x | sum |          avg           | count 
---+-----+------------------------+-------
 1 |   1 | 1.00000000000000000000 |     1
 2 |   2 |     2.0000000000000000 |     1
 3 |   3 |     3.0000000000000000 |     1
(3 rows)

Here is where I think you're confused...

SELECT x, sum(sum(x)) OVER (), avg(avg(x)) OVER (), count(count(x)) OVER ()
FROM ( VALUES (1),(2),(3) ) AS t(x)
GROUP BY x;

 x | sum |          avg           | count 
---+-----+------------------------+-------
 1 |   6 | 2.00000000000000000000 |     3
 2 |   6 | 2.00000000000000000000 |     3
 3 |   6 | 2.00000000000000000000 |     3
(3 rows)

When we GROUP BY (x) with a window function, we add more complexity. In the above we still get the same result set, but we calculate first the sum, avg, and count over a GROUP with one row. Look at the query right above this one. Then after that runs, we calculate the window functions sum (1+2+3), avg (1+2+3)/2, and count 1+1+1 over a window which includes all the aggregates above. sum(sum(x)) OVER () finds the sum(x) for the group which is one row, and then after that's done processing the sum() OVER () for the window.