So I want to get cumulative data, for every day since a start date until an end date (or today). I have constructed a query that will get the data from a table, but for some reason, some dates are missing. My left join isn't working and I am not getting the desired results. Can anybody point out my mistake?
(also, as a small detail, how do I only get the date and not the timestamp?)
Current result (note the missing dates):
date | sum
------------------------+----------
2012-09-01 00:00:00+02 | 8.000
2012-09-03 00:00:00+02 | 29.300
2012-09-07 00:00:00+02 | 31.300
Expected result (a missing date behaves as distance 0, so sum shouldn't increase):
date | sum
------------------------+----------
2012-09-01 00:00:00+02 | 8.000
2012-09-02 00:00:00+02 | 8.000
2012-09-03 00:00:00+02 | 29.300
2012-09-04 00:00:00+02 | 29.300
2012-09-05 00:00:00+02 | 29.300
2012-09-06 00:00:00+02 | 29.300
2012-09-07 00:00:00+02 | 31.300
Query:
WITH dates as (SELECT min(date) as start_date,
max(date) as end_date
from training_training
where athlete_id = 1)
SELECT distinct(d.date),
sum(distance) OVER (ORDER BY d.date)
FROM
(
SELECT generate_series(start_date, end_date, interval '1 day') as date
FROM dates
) d
LEFT JOIN training_training
ON d.date = training_training.date
WHERE athlete_id = 1
AND kind in ('t', 'd', 'i', 'w')
GROUP BY d.date, distance
ORDER BY d.date
Best Answer
As inspired by a_horse_with_no_name, I moved the right part of the join to a subquery so the where clause doesn't interfere with the join condition
This works as intended: