SQL Server: Average of Last 4 Runs Using Window Functions

aggregatesql serversql-server-2012window functions

I need to show the average weight of a product over its last 4 production runs. I'm not sure how best to describe it other than an example: Let's imagine I have the following table that lists a product by date it was created, and the average weight of the product for that day:

+---------+---------+--------+
| Product |  Date   | Weight |
+---------+---------+--------+
|  900000 | Jan 1   | 20.0   |
|  900000 | March 3 | 12.2   |
|  900000 | July 6  | 15.0   |
|  900000 | July 7  | 14.0   |
|  900000 | Aug 6   | 3.0    |
|  800000 | June 2  | 14.0   |
|  800000 | June 3  | 12.0   |
+---------+---------+--------+

The end result I'm looking for is to add a column that includes the average Weight for the last 4 dates the product was run for, so something like this:

+---------+---------+--------+----------------+
| Product |  Date   | Weight | Average Weight |
+---------+---------+--------+----------------+
|  900000 | Jan 1   | 20.0   | NULL           |
|  900000 | March 3 | 12.2   | NULL           |
|  900000 | July 6  | 15.0   | NULL           |
|  900000 | July 7  | 14.0   | NULL           |
|  900000 | Aug 6   | 3.0    | 15.3           | Jan1+Mar3+July6+July7/4
|  900000 | Aug 8   | 13.0   | 11.05          | Mar3+July6+July7+Aug6/4
|  800000 | June 2  | 14.0   | NULL           |
|  800000 | June 3  | 12.0   | NULL           |
|  800000 | June 4  | 12.0   | NULL           |
|  800000 | June 5  | 12.0   | NULL           |
|  800000 | June 6  | 12.0   | 12.5           | etc...
+---------+---------+--------+----------------+

The NULL's are just there since in this sample you can't calculate the average over the last 4 runs because the data isn't there.

Could anyone point me in the direction I need to be looking to do something like this?

Best Answer

Sample data:

CREATE TABLE dbo.Thing
(
    Product integer NOT NULL,
    TheDate date NOT NULL,
    TheWeight decimal(5, 1) NOT NULL
);

INSERT dbo.Thing
    (Product, TheDate, TheWeight)
VALUES
    (900000, CONVERT(date, '20160101', 112), 20.0),
    (900000, '20160303', 12.2),
    (900000, '20160706', 15.0),
    (900000, '20160707', 14.0),
    (900000, '20160806', 3.0 ),
    (900000, '20160808', 13.0 ),
    (800000, '20160602', 14.0),
    (800000, '20160603', 12.0),
    (800000, '20160604', 12.0),
    (800000, '20160605', 12.0),
    (800000, '20160606', 12.0);

Solution:

The general idea here is to use the extended window aggregate functions available in SQL Server 2012 and later.

The only wrinkle is that AVG does not return null over a window if it is smaller than the required four rows. To address that, we also calculate the number of rows found in the window using COUNT. A simple CASE expression can then be used to return a null if the window holds fewer than four rows:

SELECT
    T.Product,
    T.TheDate,
    T.TheWeight,
    [Average Weight] =
        CASE
            WHEN
                4 > COUNT_BIG(*) OVER (
                    PARTITION BY T.Product
                    ORDER BY T.Product, T.TheDate
                    ROWS BETWEEN 4 PRECEDING
                    AND 1 PRECEDING
                    )
                THEN NULL
            ELSE
                AVG(T.TheWeight) OVER (
                    PARTITION BY T.Product
                    ORDER BY T.Product, T.TheDate
                    ROWS BETWEEN 4 PRECEDING
                    AND 1 PRECEDING
                    )
        END
FROM dbo.Thing AS T
ORDER BY
    T.Product,
    T.TheDate;

Run the query on Stack Exchange Data Explorer

Output:

Results

Execution plan

More information:

Window Functions in SQL Server

Related question:

Date range rolling sum using window functions