SQL Server Window Functions – Conditional Window Function with a Twist

sql serverwindow functions

Asking for a friend who uses SQL Server Parallel Warehouse.

They™ has a table of weekly sales amounts, like so (please forgive improper column justification):

+-------+------------+
| week  |  amount    |
+-------+------------+
|    1  | 100.00     |
|    2  | 100.00     |
|    3  | 100.00     |
|    4  | 100.00     |
|    5  | 100000.00  |
|    6  | 100.00     |
|    7  | 50000.00   |
|    8  | 50000.00   |
|    9  | 50000.00   |
|   10  | 100.00     |
+-------+------------+

And also a list of "bad" weeks, e.g.

+------+
| week |
+------+
|    5 |
|    7 |
|    8 |
|    9 |
+------+

And They™ needs to select for each week, including "bad" weeks, the sum of sales for the four preceding not "bad" weeks, i.e. going as far back as possible, skipping "bad" week records, to add up at most four sales amounts. So the expected result would be:

+-------+------------+
| week  | sum_not_bad|
+-------+------------+
|    1  | null       |
|    2  | 100.00     |
|    3  | 200.00     |
|    4  | 300.00     |
|    5  | 400.00     |
|    6  | 400.00     |
|    7  | 400.00     |
|    8  | 400.00     |
|    9  | 400.00     |
|   10  | 400.00     |
+-------+------------+

I have a fiddle that has I think one step in the right direction but I can't figure out the next step(s).

Does anyone have insights?

Best Answer

Here's one way, it uses PARTITION to group all the good weeks together and get the cumulative good count for the preceding 4 good weeks. Then an approach along the lines of the Solution 2 Using Concatenation here to work around the lack of support for LAST_VALUE ignoring NULLs and cascade down the previous "good" value.

It keeps track of two cumulative sums. One including the current row (used by the next row if the "next" row is bad) and one without the current row.

WITH T
     AS (SELECT d.week,
                d.amount,
                CASE WHEN b.week IS NULL THEN 0 ELSE 1 END AS is_bad_week,
                SUM(CASE WHEN b.week IS NULL THEN d.amount END)
                  OVER ( PARTITION BY CASE WHEN b.week IS NULL THEN 0 ELSE 1 END ORDER BY d.week rows BETWEEN 4 PRECEDING AND 1 PRECEDING) cume_sum_prev4toprev1,
                SUM(CASE WHEN b.week IS NULL THEN d.amount END)
                  OVER ( PARTITION BY CASE WHEN b.week IS NULL THEN 0 ELSE 1 END ORDER BY d.week rows BETWEEN 3 PRECEDING AND CURRENT ROW) cume_sum_prev3tocurrent
         FROM   data d
                LEFT JOIN bad_weeks b
                       ON d.week = b.week)
SELECT week,
       CASE WHEN is_bad_week = 1 THEN 
       CAST(SUBSTRING(MAX(RIGHT(CONCAT('0000000000', week), 10) + CAST(cume_sum_prev3tocurrent AS VARCHAR(20))) OVER (ORDER BY week), 11, 20) AS DECIMAL(20, 2)) 
       ELSE
       CAST(SUBSTRING(MAX(RIGHT(CONCAT('0000000000', week), 10) + CAST(cume_sum_prev4toprev1 AS VARCHAR(20))) OVER (ORDER BY week), 11, 20) AS DECIMAL(20, 2))
       END AS sum_not_bad
FROM   T
ORDER  BY week