PostgreSQL – SUM Time Interval and Group by 30 Minutes

postgresqlpostgresql-9.2timestamp

I need to SUM Time of this Interval and Group by 30 and 30 minutes.

CREATE TABLE foo AS
SELECT date_start::timestamp,
  date_end::timestamp,
  agent
FROM ( VALUES
  ('2017-01-06 06:52','2017-01-06 06:52',164),
  ('2017-01-06 06:56','2017-01-06 07:25',164),
  ('2017-01-06 06:56','2017-01-06 07:05',169),
  ('2017-01-06 06:56','2017-01-06 10:29',142),
  ('2017-01-06 06:57','2017-01-06 07:23',116)
) AS t(date_start,date_end,agent);

like this example: need to group by 30 and 30 minutes and sum real dates.

    period          |   sum_times
---------------------------------
06:30:00 - 07:00:00 |   00:12:36
07:00:00 - 07:30:00 |   01:24:15
07:30:00 - 08:00:00 |   0:30:00
08:00:00 - 08:30:00 |   0:30:00
08:30:00 - 09:00:00 |   0:30:00
09:00:00 - 09:30:00 |   0:30:00
09:30:00 - 10:00:00 |   0:30:00
10:00:00 - 10:30:00 |   0:29:40

Sql code:

WITH min_max_time AS (
  SELECT MIN(datahora_ini::timestamp(0) ), MAX(datahora_fim::timestamp(0) )
  FROM callcenter.agente_login
  WHERE  datahora_ini >= '2017-01-06 00:00:00' AND datahora_ini <= '2017-01-06 23:59:59'
), periods(time) AS (
  SELECT generate_series(min, max, '30 minutes'::interval)
  FROM min_max_time
), w_agentes_logados AS (
SELECT
  CONCAT(to_char((time),'YYYYMMDD'), to_char(to_timestamp(floor((extract('epoch' from time) / 1800 )) * 1800) AT TIME ZONE 'UTC','HH24MI') ) as id_chamadas_receptiva,
  SUM(CASE WHEN datahora_ini::timestamp(0)  <= time AND datahora_fim::timestamp(0)  >= time THEN EXTRACT(EPOCH FROM (datahora_fim - datahora_ini)) else 0 end) as quantidade

FROM callcenter.agente_login, periods
GROUP BY time
)

SELECT * FROM w_agentes_logados

expected result, but it's wrong it's not returning the real result.

time_id           | qty (seconds)
----------------------------------
201701060630          | 9   
201701060700          | 103418  
201701060730          | 112410  
201701060800          | 137369  
201701060830          | 279747  
201701060900          | 370753  
201701060930          | 372496  

Best Answer

I've got a ton of questions here,

  1. Are these all the same day?
  2. What should happen if there is no overlap in time?

Let's also focus on this table from the times of 6:30-7:00

     date_start      |      date_end       | overlap  
---------------------+---------------------+----------
 2017-01-06 06:52:00 | 2017-01-06 06:52:00 | 
 2017-01-06 06:56:00 | 2017-01-06 07:25:00 | 00:04:00
 2017-01-06 06:56:00 | 2017-01-06 07:05:00 | 00:04:00
 2017-01-06 06:56:00 | 2017-01-06 10:29:00 | 00:04:00
 2017-01-06 06:57:00 | 2017-01-06 07:23:00 | 00:03:00

So how do you get this in your ideal result set?

06:30:00 - 07:00:00 |   00:12:36

Please, value our time... When you create sample data, it's important for you to be sure you're not misleading us. I'm sure it's an innocent mistake, but double check next time. Moreover, this should set you down the right path to get this job done..

SELECT
  tsrange,
  sum(
    UPPER(tsrange*tsrange(date_start,date_end)) -
    LOWER(tsrange*tsrange(date_start,date_end))
  )
FROM foo
JOIN (
  SELECT tsrange(ts::timestamp, ts+'30 minutes'::interval) AS tsrange
  FROM generate_series(
    '2017-01-06'::timestamp,
    '2017-01-17'::timestamp,
    '30 minutes'::interval
  ) AS t(ts)
) AS t(tsrange)
  ON tsrange && tsrange(date_start,date_end)
GROUP BY tsrange
ORDER BY tsrange
;

Let's break it down, here we generate the times in 30 minute intervals as a timestamp range..

JOIN (
  SELECT tsrange(ts::timestamp, ts+'30 minutes'::interval) AS tsrange
  FROM generate_series(
    '2017-01-06'::timestamp,
    '2017-01-17'::timestamp,
    '30 minutes'::interval
  ) AS t(ts)
) AS t(tsrange)
  ON tsrange && tsrange(date_start,date_end)

Then we join this set on an overlap test. The only other clever unintuitive thing is this,

sum(
  UPPER(tsrange*tsrange(date_start,date_end)) -
  LOWER(tsrange*tsrange(date_start,date_end))
)

In my opinion tsrange - tsrange should return an interval (or a cast should be provided). One isn't. So we have to do what's done above. This provides us the duration of the overlap that we previously detected with &&. We can sum(interval) and produce the result set you want..

                    tsrange                    |   sum    
-----------------------------------------------+----------
 ["2017-01-06 06:30:00","2017-01-06 07:00:00") | 00:15:00
 ["2017-01-06 07:00:00","2017-01-06 07:30:00") | 01:23:00
 ["2017-01-06 07:30:00","2017-01-06 08:00:00") | 00:30:00
 ["2017-01-06 08:00:00","2017-01-06 08:30:00") | 00:30:00
 ["2017-01-06 08:30:00","2017-01-06 09:00:00") | 00:30:00
 ["2017-01-06 09:00:00","2017-01-06 09:30:00") | 00:30:00
 ["2017-01-06 09:30:00","2017-01-06 10:00:00") | 00:30:00
 ["2017-01-06 10:00:00","2017-01-06 10:30:00") | 00:29:00
(8 rows)