I need to calculate the total duration of events in a given time interval and on a given day of the week with PostgreSQL 11
For an event on the same day no problem. But over several days I manage to approximate the result with "generate_series".u Is there another way faster and more precise?
Interval In:
Start---------------------------------------End
E.begin_date ------ E.end_date -> E.end_date - E.begin_date
Start--------------------------------End
E.begin_date -------------------E.end_date -> End - E.begin_date
Start--------------------------------End
E.begin_date -----------------------E.end_date -> E.end_date - Start
Start------------------End
E.begin_date-------------------------------------E.end_date -> End - Start
Interval In on several Days with generate_series
Day 1 Day 2 etc....
| |
Start-------End | Start---------End | Start---------End
E.begin_date--------------------------------------------------------------E.end_date
Interval Out:
Start--------End
E.begin_date ------------------- E.end_date -> (Start - E.begin_date) + (E.end_date - End)
Start--------------------------------End
E.begin_date -------------------E.end_date -> E.end_date - End
Start--------------------------------End
E.begin_date -----------------------E.end_date -> Start - .begin_date
Start--------End
E.begin_date--------E.end_date -> E.end_date - E.begin_date
Interval Out on several Days with generate_series
Day 1 Day 2 etc....
| |
Start-------End | Start---------End | Start---------End
E.begin_date--------------------------------------------------------------E.end_date
This is my sql request:
select * FROM (SELECT sum(
case
when (begin_date AT TIME ZONE 'Europe/Paris')::date = (end_date AT TIME ZONE 'Europe/Paris')::date -- same days
then
case
-- ------time_begin------------------------------------------------------------------------time_end----
-- begin_date------------------------------------------------end_date
when (begin_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time
AND
(end_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time
then (end_date AT TIME ZONE 'Europe/Paris')::time - (begin_date AT TIME ZONE 'Europe/Paris')::time
-- ------time_begin------------------------------------------------------------------------time_end----
-- begin_date------------------------------------------------end_date
when (begin_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time
AND
(begin_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time
AND
(end_date AT TIME ZONE 'Europe/Paris')::time >= '18:00'::time
then '18:00'::time - (begin_date AT TIME ZONE 'Europe/Paris') ::time
-- -----------------time_begin------------------------------------------------------------------------time_end----
-- begin_date-----------------------------------end_date
when (end_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time
AND
(end_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time
AND
(begin_date AT TIME ZONE 'Europe/Paris')::time <= '08:00'::time
then (end_date AT TIME ZONE 'Europe/Paris')::time - '08:00'::time
-- -----------------time_begin-------------------------------------time_end----
-- begin_date----------------------------------------------------------------end_date
when (begin_date AT TIME ZONE 'Europe/Paris')::time <= '08:00'::time
AND
(end_date AT TIME ZONE 'Europe/Paris')::time >= '18:00'::time
then '18:00'::time - '08:00'::time
END
when (begin_date AT TIME ZONE 'Europe/Paris')::date < (end_date AT TIME ZONE 'Europe/Paris')::date -- on many days
then end_date - begin_date
(select COUNT(*) * interval '5 min'
FROM generate_series(begin_date AT TIME ZONE 'Europe/Paris', end_date AT TIME ZONE 'Europe/Paris', interval '5 min') m
where (m::time AT TIME ZONE 'Europe/Paris') >= '08:00'::time
AND (m::time AT TIME ZONE 'Europe/Paris') <= '18:00'::time
AND EXTRACT(dow FROM m) + 1 in (2, 3)
)
END) as total_included
FROM "event"
WHERE ( (EXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) OR
EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) )
AND ("event"."begin_date" AT TIME ZONE 'Europe/Paris')::date >= '2019-12-01T00:00:00'::date
AND ("event"."begin_date" AT TIME ZONE 'Europe/Paris')::date <= '2020-01-01T00:00:00'::date
AND (
((event.begin_date AT TIME ZONE 'Europe/Paris')::time between '08:00'::time AND '18:00'::time)
OR
((event.end_date AT TIME ZONE 'Europe/Paris')::time between '08:00'::time AND '18:00'::time)
OR (
((event.begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time)
AND
((event.end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time))
))
order by total_included desc) as included,
(SELECT sum(
case
when (begin_date AT TIME ZONE 'Europe/Paris')::date = (end_date AT TIME ZONE 'Europe/Paris')::date -- same day
then
case
-- -----------------------------------------------------------------time_begin---------------------------------time_end----
-- begin_date-------------------------------end_date
-- or
-- -------------------------------time_begin---------------------------------time_end-----------
-- begin_date----------------------------------------------------------------------end_date
when ((begin_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time)
AND
((end_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time)
then
case
when ((begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time)
then end_date::time - begin_date::time
when (begin_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time
then end_date::time - begin_date::time
when (begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time
then ('08:00'::time - (begin_date AT TIME ZONE 'Europe/Paris')::time) + ((end_date AT TIME ZONE 'Europe/Paris')::time - '18:00'::time)
end
-- --------------------------------------time_begin---------------------------------time_end----
-- begin_date-----------------------------------end_date
when ((begin_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time)
AND
((end_date AT TIME ZONE 'Europe/Paris')::time BETWEEN '08:00'::time AND '18:00'::time)
then '08:00'::time - (begin_date AT TIME ZONE 'Europe/Paris')::time
-- ----------time_begin---------------------------------time_end----
-- begin_date-----------------------------------end_date
when ((begin_date AT TIME ZONE 'Europe/Paris')::time BETWEEN '08:00'::time AND '18:00'::time)
AND
((end_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time)
then (end_date AT TIME ZONE 'Europe/Paris')::time - '18:00'::time
end
when (begin_date AT TIME ZONE 'Europe/Paris')::date < (end_date AT TIME ZONE 'Europe/Paris')::date --many days
then
(select COUNT(*) * interval '5 min'
FROM generate_series(begin_date , end_date, interval '5 min') m
where (m AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time
AND (m AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time
AND EXTRACT(dow FROM m AT TIME ZONE 'Europe/Paris') + 1 in (2, 3)
)
END) as total_excluded
FROM "event"
WHERE ((EXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) OR
EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) )
AND ("event"."date_creation" AT TIME ZONE 'Europe/Paris')::date >= '2019-12-01T00:00:00'::date
AND ("event"."date_creation" AT TIME ZONE 'Europe/Paris')::date <= '2020-01-01T00:00:00'::date
AND (
((event.begin_date AT TIME ZONE 'Europe/Paris')::time not between '08:00'::time AND '18:00'::time)
OR
((event.end_date AT TIME ZONE 'Europe/Paris')::time not between '08:00'::time AND '18:00'::time)
OR (
((event.begin_date AT TIME ZONE 'Europe/Paris')::time > '08:00'::time)
AND
((event.end_date AT TIME ZONE 'Europe/Paris')::time < '18:00'::time))
))
order by total_excluded desc) as excluded
Data Sample for interval IN ('8am', '6pm') for Wednesday Thursday:
|---------|------------------|------------------|------------------|
| id | begin_date | end_date | Result(Duration) |
|---------|------------------|------------------|------------------|
| 1 | 2020-01-01 10:00 | 2020-01-01 12:00 | 02:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 2 | 2020-01-01 10:00 | 2020-01-01 20:00 | 08:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 3 | 2020-01-01 07:00 | 2020-01-01 14:00 | 06:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 4 | 2020-01-01 07:00 | 2020-01-01 19:00 | 10:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 5 | 2020-01-01 08:00 | 2020-01-03 18:00 | 20:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 6 | 2020-01-01 09:00 | 2020-01-05 17:00 | 16:00:00 |
|---------|------------------|------------------|------------------|
Data Sample for interval OUT ('8am', '6pm') for Wednesday Thursday:
|---------|------------------|------------------|------------------|
| id | begin_date | end_date | Result(Duration) |
|---------|------------------|------------------|------------------|
| 1 | 2020-01-01 10:00 | 2020-01-01 12:00 | 00:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 2 | 2020-01-01 10:00 | 2020-01-01 20:00 | 02:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 3 | 2020-01-01 07:00 | 2020-01-01 14:00 | 01:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 4 | 2020-01-01 07:00 | 2020-01-01 19:00 | 02:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 5 | 2020-01-01 07:00 | 2020-01-03 18:00 | 02:00:00 |
|---------|------------------|------------------|------------------|
|---------|------------------|------------------|------------------|
| 6 | 2020-01-01 09:00 | 2020-01-05 17:00 | 02:00:00 |
|---------|------------------|------------------|------------------|
The input parameters are:
- start date and end date
- start time and end time
- a list of the days of the week
To reach a result where we have the total duration of events within the interval, but also the total duration of events outside the interval of hours. All this for a list of days of the given week.
Days of the week means that if an event lasts several days say from Monday to Wednesday, but in the filter there are only Monday and Tuesday, the duration of Wednesday will not be counted.
The current query works but to handle events that take place over several days I use generate_series
, which is not very efficient. The question is how to improve this query.
Best Answer
Interpretation of the question
Given that an event starts at
begin_date
and is finished atend_date
, the goal is to map this duration onto days/hours that seem like hours open to business, presumably to compute how much work time each event consumed.Based on the query provided, the working hours are
08:00-18:00
. It's not clear how you determine the working days. Sometimes the condition in the query isEXTRACT(dow FROM m) + 1 in (1)
, and at other places it'sEXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2) OR EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2) )
The question says that there is a list of the days of the week in input but I don't see that in the query.
Solution with intersection of ranges
Below is a query that demonstrates how to use ranges to sum your "IN" intervals.
What it does:
08:00-18:00
ranges for each day between the start and end of eventQuery
Result
With the sample data the result is
This matches the question except for id=6, but that might be because the
dow
filter doesn't match yours.As for the OUT intervals, if I understand correctly I think it's just a difference, that is
OUT = (end_date - begin_date) - IN
for each event.