First due you need to generate a series of dates, I'd suggest to use a calendar
table.
CREATE TABLE if not exists calendar (
mdate date PRIMARY KEY NOT NULL
);
INSERT INTO calendar values
('20170403'),('20170404'),('20170405'),('20170406'),('20170407'),('20170408');
How do they do it
Just to get overlapped activities I've used the query that you provided on your question.
create view overlaped_activities
as
SELECT name, min(start) AS start, end, TIMESTAMPDIFF(MINUTE, MIN(start), end) AS time
FROM (
SELECT x.name, x.start, min(y.end) AS end
FROM activities AS x
JOIN activities AS y
ON x.name = y.name
AND x.start <= y.end
AND NOT EXISTS (
SELECT 1
FROM activities AS z
WHERE y.name = z.name
AND y.end >= z.start
AND y.end < z.end
)
WHERE NOT EXISTS (
SELECT 1
FROM activities AS u
WHERE x.name = u.name
AND x.start > u.start
AND x.start <= u.start
)
GROUP BY x.name, x.start
) AS v GROUP BY name, end;
First I calculate minutes from start date till midnight:
if(date(start) = date(end),
time_to_sec(timediff(end, start)) / 60,
(1440 - time_to_sec(time(start)) / 60)) mstart
Then, if start <> end, I calculate minutes from midnight till end date:
if(date(start) = date(end), 0, time_to_sec(time(end)) / 60) mend
This returns a table like this:
| start | end | mdiff | mstart | mend |
|---------------------|---------------------|----------:|--------:|--------:|
| 03.04.2017 11:00:00 | 03.04.2017 11:30:00 | 30,0000 | 30,0000 | 0 |
| 03.04.2017 23:45:00 | 04.04.2017 00:15:00 | 30,0000 | 15,0000 | 15,0000 |
| 04.04.2017 10:00:00 | 04.04.2017 11:30:00 | 90,0000 | 90,0000 | 0 |
| 05.04.2017 23:00:00 | 07.04.2017 00:45:00 | 1545,0000 | 60,0000 | 45,0000 |
That's nice, but there is another problem here:
| 05.04.2017 23:00:00 | 07.04.2017 00:45:00 | 1545,0000 | 60,0000 | 45,0000 |
Of course: 1545 <> 60 + 45
We need to generate a series of dates between start and end date, and add 1440 minutes to each day.
We can get it using the calendar table:
select name,
mdate date_activity,
sum(1440) minutes
from calendar
join overlaped_activities
on calendar.mdate > date(start)
and calendar.mdate < date(end)
where datediff(end, start) > 1
group by name, mdate
Ok, we got all ingredients, it's time to cook the recipe:
select name, date_activity, sum(minutes) min_activity
from (
select name,
date(start) date_activity,
if(date(start) = date(end), time_to_sec(timediff(end, start)) / 60, (1440 - time_to_sec(time(start)) / 60)) minutes
from overlaped_activities
UNION ALL
select name,
date(end) date_activity,
if(date(start) = date(end), 0, time_to_sec(time(end)) / 60) minutes
from overlaped_activities
UNION ALL
select name,
mdate date_activity,
sum(1440) minutes
from calendar
join overlaped_activities
on calendar.mdate > date(start)
and calendar.mdate < date(end)
where datediff(end, start) > 1
group by name, mdate
) act
group by name, date_activity;
Final result:
| name | date_activity | min_activity |
|------|--------------------:|-------------:|
| me | 03.04.2017 00:00:00 | 45,0000 |
| me | 04.04.2017 00:00:00 | 105,0000 |
| me | 05.04.2017 00:00:00 | 60,0000 |
| me | 06.04.2017 00:00:00 | 1440,0000 |
| me | 07.04.2017 00:00:00 | 45,0000 |
Almost forget it, the recipe: http://rextester.com/EIJOI20983
Best Answer
Assuming data type
timestamp
. Some details are different fordate
ortimestamptz
.A general solution for any time interval can be based on the epoch value and integer division to truncate. Covers all your examples.
The special difficulty of your task: you want the ceiling, not the floor (which is much more common). Exercise care with lower and upper bounds to avoid corner case bugs: you don't want to increment exact floor values. (Or so I assume.)
For common time intervals built into
date_trunc()
(like1 hour
and1 day
in your examples) you can use a shortcut. The definition of days depends on the time zone setting of the session withtimestamptz
(but not withtimestamp
).A natural alternative is with
ceil()
. A bit slower in my tests, but cleaner.Short demo
The "trick" for the '5 min' calculation is to subtract the minimum time interval of 1 µs before truncating, and then add the respective time interval to effectively get the ceiling.
EXTRACT()
returns the number of seconds in the timestamp, adouble precision
number with fractional digits down to microseconds. We needtrunc()
because the plain cast tointeger
would round, while we need to truncate.This way we avoid incrementing timestamps that fall on the upper bound exactly. It is slightly dirty, though, because the minimum time interval is an implementation detail of current Postgres versions. Very unlikely to change though. Related:
The '10 min' calculation is simpler with
ceil()
, we don't need to shift bounds by subtracting 1 µs. Cleaner. Butceil()
is slightly more expensive in my tests.Extended test case
db<>fiddle here
I added an alternative shortcut for full days:
ts1::date + 1
. The cast todate
truncates to the full day and we can addinteger
1 to add a day.Function wrapper
You later disclosed you work with
timestamptz
, so we can dropAT TIME ZONE
from the expression.In my tests declaring the function
STABLE
yielded best performance because it allowed function inlining. I would have expectedIMMUTABLE
to be best, but that declaration is more picky about what's allowed inside to be inlined. Related:A bit faster in my tests:
Cleaner IMO:
Call:
For convenience, you could overload each function with an alternative taking an
interval
as$2
:Just invoking the first version with extracted seconds. Then you can also call: