Question: I would like to create a query that can return a list of all days of the month with a sum of all duration for each day including zero for each day that has no data.
Database: PostgreSQL 9.x
Table structure (only relevant columns shown):
-------------------------------------------------------------
| id | start_at | finish_at | duration |
-------------------------------------------------------------
| 1 | 2015-08-14 14:01:00 | 2015-08-15 13:59:00 | 86280 |
-------------------------------------------------------------
I have the following query for grabbing the data I want but the part that is missing is the days between with zero and I'm not entirely sure on the best approach to achieve the desired outcome.
SELECT date_trunc('day', start_at) AS "day" , sum(duration) AS "duration"
FROM time_logs
JOIN users ON time_logs.user_id = users.id
JOIN account_users ON account_users.user_id = users.id
JOIN categories ON categories.id = time_logs.activity_id
WHERE account_users.account_id IN (1,2,3)
AND time_logs.start_at BETWEEN '2017-03-01 00:00:00 +1000' AND '2017-03-31 00:00:00 +1000'
AND categories.report_group = 'Segment1'
GROUP BY 1
ORDER BY 1
Best Answer
Add an outer join to a complete series of days using
generate_series()
:And
COALESCE
to get 0 ("zero") instead of NULL, if you need that.BETWEEN '2017-03-01 00:00 +1' AND '2017-03-31 00:00 +1'
would exclude the last day of march (except for its very first µs). I changed it to include the whole month.AND t.start_at < '2017-04-01 00:00 +1'
has the additional advantage that you don't need to adapt the day of the upper bound, just the month.Since
start_at
is obviouslytimestamptz
, that's the most efficient call with matching data type. Detailed explanation here:The actual data type of
start_at
and your actual time zone setting might lead to disagreement when days start. Consider: