PostgreSQL – Create window (PARTITION BY?) out of column values

postgresql

I have a table structured as:

| date       | key | value |
|------------|-----|-------|
| 2019-01-02 |S1   |20     |
| 2019-02-04 |S1   |30     |
| 2019-03-10 |S2   |15     |
| 2019-04-07 |S1   |0      |
| 2019-04-13 |S2   |35     |
| 2019-04-19 |S1   |10     |
| 2019-05-01 |S1   |30     |
| 2019-05-15 |S1   |40     |
| 2019-06-21 |S1   |0      |

For each key, I want to retrieve the date of the first record together with the date of the record whose value is 0. If a key has multiple records with value 0, I want to split its rows into partitions that each end at a 0 and compute this for each partition.

Using the table above as an example, below is the result for S1:

The first partition is as follows:

| 2019-01-02 |S1   |20     | <- Output this
| 2019-02-04 |S1   |30     |
| 2019-04-07 |S1   |0      | <- And this

and its output should be

| date       | date_of_zero |
|------------|--------------|
| 2019-01-02 | 2019-04-07   |

The second partition will be

| 2019-04-19 |S1   |10     | <- Output this 
| 2019-05-01 |S1   |30     |
| 2019-05-15 |S1   |40     |
| 2019-06-21 |S1   |0      | <- And this

and its output will be

| date       | date_of_zero |
|------------|--------------|
| 2019-04-19 | 2019-06-21   |

The overall result expected:

| key | date       | date_of_zero |
|-----|------------|--------------|
| S1  | 2019-01-02 | 2019-04-07   |
| S1  | 2019-04-19 | 2019-06-21   |

I have tried to come up with solutions using PARTITION BY and LATERAL JOIN, but I do not even know how to get started: partitioning by value = 0, as if it were a WHERE clause, does not work as a partition expression.


I am wondering whether this is solvable in SQL (within reasonable query complexity), or whether one is better off fetching the rows and doing the windowing at the application layer.

Best Answer

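-- The short column names d, k, v stand for date, key, value; the table is named test.
WITH cte AS (
    -- r counts the zeros at or after each row within its key (dates walked backwards),
    -- so all rows that lead up to the same zero share the same r.
    SELECT *,
           COUNT(CASE WHEN v = 0 THEN 0 END) OVER (PARTITION BY k ORDER BY d DESC) AS r
    FROM test
)
SELECT k AS key, MIN(d) AS "date", MAX(d) AS date_of_zero
FROM cte
WHERE r > 0          -- drop rows that come after the last zero of a key
GROUP BY k, r
ORDER BY 1, 2;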

fiddle
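To see how the reverse running count forms the groups, the query can be replayed against the sample rows from the question. The sketch below is only a stand-in for the fiddle and rests on a few assumptions: it uses Python's built-in sqlite3 module with an in-memory database instead of PostgreSQL (SQLite 3.25 or newer is needed for window functions), the dates are stored as ISO text, and the helper name `run` is made up for illustration.

```python
import sqlite3

# Sample rows from the question as (d, k, v); the short column names and the
# table name "test" follow the answer's query.
ROWS = [
    ("2019-01-02", "S1", 20),
    ("2019-02-04", "S1", 30),
    ("2019-03-10", "S2", 15),
    ("2019-04-07", "S1", 0),
    ("2019-04-13", "S2", 35),
    ("2019-04-19", "S1", 10),
    ("2019-05-01", "S1", 30),
    ("2019-05-15", "S1", 40),
    ("2019-06-21", "S1", 0),
]

# Intermediate step only: r is the number of zeros at or after each row, per key.
CTE_QUERY = """
SELECT d, k, v,
       COUNT(CASE WHEN v = 0 THEN 0 END)
           OVER (PARTITION BY k ORDER BY d DESC) AS r
FROM test
ORDER BY k, d
"""

# The full query from the answer (output aliases omitted).
FULL_QUERY = """
WITH cte AS (
    SELECT *,
           COUNT(CASE WHEN v = 0 THEN 0 END)
               OVER (PARTITION BY k ORDER BY d DESC) AS r
    FROM test
)
SELECT k, MIN(d), MAX(d)
FROM cte
WHERE r > 0
GROUP BY k, r
ORDER BY 1, 2
"""


def run(sql, rows=ROWS):
    """Load rows into a throwaway in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE test (d TEXT, k TEXT, v INTEGER)")
        conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run(CTE_QUERY):
        print(row)   # S1 rows get r = 2 up to the first zero, then r = 1; S2 rows get r = 0
    for row in run(FULL_QUERY):
        print(row)
```

Every S2 row ends up with r = 0 because S2 never reaches a zero, which is why the key is absent from the final result.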

If two "zeros" are adjacent, the second one (by date) forms a separate group on its own. The same applies to a "zero" that is the earliest record for its key.

If a "zero" and a "non-zero" record can share the same date, you must decide which of them takes priority.
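One way to make that decision explicit is a tie-breaker in the window's ORDER BY. The sketch below is an illustration under assumptions, not part of the original answer: the rows are hypothetical, SQLite (3.25+) via Python's sqlite3 stands in for PostgreSQL, and `windows` is a made-up helper. With `ORDER BY d DESC` alone, rows on the same date are peers and share one count, so the non-zero row lands in the window closed by the zero of that date. Adding `(v = 0)` as a second sort key treats the non-zero row as the later one, so it opens the next window instead.

```python
import sqlite3

# Hypothetical rows for one key: a zero and a non-zero share 2019-01-02.
ROWS = [
    ("2019-01-01", "S1", 5),
    ("2019-01-02", "S1", 0),
    ("2019-01-02", "S1", 7),
    ("2019-01-03", "S1", 0),
]

# The answer's query with the window ordering left as a placeholder.
TEMPLATE = """
WITH cte AS (
    SELECT *,
           COUNT(CASE WHEN v = 0 THEN 0 END)
               OVER (PARTITION BY k ORDER BY {order}) AS r
    FROM test
)
SELECT k, MIN(d), MAX(d)
FROM cte
WHERE r > 0
GROUP BY k, r
ORDER BY 1, 2
"""


def windows(order, rows=ROWS):
    """Run the query with the given window ordering on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE test (d TEXT, k TEXT, v INTEGER)")
        conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
        return conn.execute(TEMPLATE.format(order=order)).fetchall()
    finally:
        conn.close()


# Same-date rows are peers: the 7 joins the window that the same-day zero closes.
zero_last = windows("d DESC")

# Tie-breaker: on equal dates the non-zero row counts as later than the zero,
# so the 7 starts the window that ends on 2019-01-03.
zero_first = windows("d DESC, (v = 0)")

if __name__ == "__main__":
    print(zero_last)
    print(zero_first)
```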

Trailing records are not listed if the latest record for a key is not a "zero".
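For comparison with the application-layer alternative raised in the question, the same grouping can be sketched in plain Python. This is a minimal sketch under assumptions: rows arrive as (date, key, value) tuples, dates are ISO strings (so text order is chronological), dates are unique per key, and the function name `zero_windows` is made up for illustration. It reproduces the behaviors noted above: an adjacent or leading zero forms its own group, and rows after the last zero are dropped.

```python
from itertools import groupby
from operator import itemgetter


def zero_windows(rows):
    """Return (key, first_date, date_of_zero) for every run of rows ending in a zero."""
    result = []
    ordered = sorted(rows, key=itemgetter(1, 0))  # sort by key, then by date
    for key, group in groupby(ordered, key=itemgetter(1)):
        start = None  # first date of the window that is currently open
        for date, _key, value in group:
            if start is None:
                start = date
            if value == 0:
                # A zero closes the open window; the next row opens a new one.
                result.append((key, start, date))
                start = None
        # Rows left over after the last zero are dropped (like WHERE r > 0).
    return result


# Sample rows from the question.
ROWS = [
    ("2019-01-02", "S1", 20),
    ("2019-02-04", "S1", 30),
    ("2019-03-10", "S2", 15),
    ("2019-04-07", "S1", 0),
    ("2019-04-13", "S2", 35),
    ("2019-04-19", "S1", 10),
    ("2019-05-01", "S1", 30),
    ("2019-05-15", "S1", 40),
    ("2019-06-21", "S1", 0),
]

if __name__ == "__main__":
    for row in zero_windows(ROWS):
        print(row)
```

A single pass per key is enough here, but it requires fetching all rows; the SQL version keeps the work in the database.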