Postgresql – Combine various temporal criteria to filter rows

datetimepostgresqltemporal-tables

I am trying to query a database table with some filters. I have a form through which a user will create query attributes:
Form for User
When user will click on search it will pass such values:

 {camera_id: "9879"
, from_date: "2015/05/15"
, to_date: "2016/08/15"
, interval: "120"
, schedule: "{"Monday":["3:0-4:0"],"Tuesday":["1:30-3:30"]
             ,"Wed…":["0:0-3:0"],"Saturday":[],"Sunday":["1:0-3:0"]}"}

interval has one of these names and values:

"All", "0"
"1 Frame Every 1 min", "1"
"1 Frame Every 5 min", "5"
...
"1 Frame Every hour", "60"
...
"1 Frame Every 24 hours", "1440"

For example: "1 Frame every 1 min". If 5 rows in snapshots have been taken within 1 min, the query shall only return the first.

schedule is being sent by using the FullCalendar jquery plugin. Meaning:

Day: Monday,    Time: 03:00 AM to 04:00 AM
Day: Tuesday,   Time: 01:30 AM to 03:30 AM
...

Table snapshots:

  snapshot_id integer
, camera_id   integer
, created_at  timestamp

I want to select rows passing the filters. Temporal filters apply to created_at. I want rows between from_date and to_date, considering the schedule, and only get the first row per interval.

Best Answer

A basic solution to retrieve only the first row per interval would be to

  1. Generate time series using from_date, to_dateand interval with generate_series().
  2. Join to (pre-selected) main table thereby forming 1 group per interval.
  3. Use DISTINCT ON to get the first per group.

For simpler parameter handling I wrapped the query in a plpgsql function:

CREATE OR REPLACE FUNCTION f_snapshots(_camera_id int
                                     , _from_date timestamp
                                     , _to_date   timestamp
                                     , _interval  interval = '0 min')
  RETURNS SETOF snapshots AS
$func$
BEGIN
IF _interval > '0 min' THEN
   RETURN QUERY
   SELECT DISTINCT ON (i) s.* 
   FROM  (
      SELECT *
      FROM   snapshots
      WHERE  created_at >= _from_date
      AND    created_at <  _to_date    -- exclude upper bound?
      ) s
   JOIN   generate_series(_from_date
                        , _to_date - _interval  -- adjust upper bound
                        , _interval) i ON s.created_at >= i
                                      AND s.created_at <  i + _interval
   ORDER  BY i, created_at;

ELSE  -- no interval, return all rows
   RETURN QUERY
   SELECT *
   FROM   snapshots
   WHERE  created_at >= _from_date
   AND    created_at <  _to_date
   ORDER  BY created_at;
END IF;
END
$func$  LANGUAGE plpgsql STABLE;

Call:

SELECT * FROM f_snapshots(
        _camera_id := '9879'
      , _from_date := '2015-05-15 00:00'
      , _to_date   := '2016-08-15 00:00'
      , _interval  := '120 min');

I ignored the additional filter schedule. Add that yourself. I would extract all intervals from the json document and add OR'ed WHERE conditions. Either use the SQL OVERLAPS operator or the overlap operator && for range types. More:

Also consider my answer to your previous question: