Postgresql – Get durations from time series table where some condition holds consecutively

postgresqltimestampwindow functions

Given a table where each row has a timestamp and some measurement, how can I get all of the intervals during which the measurement consecutively satisfies some condition?

Here's a SQLfiddle (not sure if there's a better site for that nowadays).

So in this case, I'd like to to get all of the durations during which status is true. The starting point should be the first true point chronologically. The end point can be the first false point or the last true point, whichever is easier.

I looked to window functions first. I could come up with separate queries, one that obtained the starting points and one that produced the end points, but it felt like I was going about it the wrong way.

Best Answer

You could try along:

SELECT
  ts
  , CASE status
      WHEN false THEN 'end'
      WHEN true THEN 'start'
      ELSE 'ERROR'
    END AS mode
FROM (
  SELECT
    ts
    , status
    , LAG(status, 1) OVER (ORDER BY ts) AS lg
  FROM log
) AS T
WHERE
  (lg IS NULL)
  OR
  (status <> lg)
ORDER BY ts
;

The sub-select is necessary as window functions are not allowed directly in the WHERE clause.

See it in action: SQL Fiddle (Thanks for preparing the fiddle! ;-) )

Update

To get the duration actually listed, try

WITH
S1 AS (
  SELECT
    ts
    , status
    , LAG(status, 1) OVER (ORDER BY ts) AS lg
    , CASE status WHEN LAG(status, 1) OVER (ORDER BY ts) THEN 0 ELSE 1 END AS toggle 
  FROM log
  ORDER BY ts
),
S2 AS (
  SELECT
    ts
    , status
    , lg
    , SUM(toggle) OVER (ORDER BY ts) AS streak
  FROM S1
  ORDER BY ts
),
S3 (ts, last_ts, status, lg, streak) AS (
  SELECT
    ts
    , LAST_VALUE(ts) OVER (PARTITION BY streak
                           ORDER BY ts
                           RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                          )
    , status
    , lg
    , streak
  FROM S2
  ORDER BY ts
),
S4 (start, duration) AS (
  SELECT
    ts
    , last_ts - ts
  FROM S3
  WHERE
    status
    AND ((status <> lg) OR (lg IS NULL))
  ORDER BY ts
),
S5 (ts, status, duration) AS (
  SELECT
    ts
    , status
    , LEAD(ts, 1, (SELECT MAX(ts) FROM S3)) OVER (ORDER BY ts) - ts
  FROM S3
  WHERE
    (status <> lg)
    OR
    (lg IS NULL)
  ORDER BY ts
),
S6 (start, duration) AS (
  SELECT
    ts
    , duration
  FROM S5
  WHERE status
)
SELECT
--  * FROM S4
  * FROM S6
;

See it in action: SQL Fiddle

It starts off by identifying the consecutive ranges. In S4, the ranges end when status is TRUE for the last time respectively; in S6 when status changed to FALSE.

This could probably be condensed a little bit. But as window functions cannot be nested (directly) anyway…

Please comment, if and as this requires adjustment / further detail.