PostgreSQL select count with dynamic date range

postgresqlpostgresql-9.6

Firstly, I am new to this forum. I hope that someone can help or advise me on my issue.

Given a User createdAt date and the current date. I would like to retrieve the sum of Friends records for the user with 10 linear time intervals between these dates.

So if the createdAt was 30 days ago then I want the sum of users Friends over EVERY 3 day interval.

User Table

╔════╤══════════╤════════════════════════════╗
║ id │ name     │ createdAt                  ║
╠════╪══════════╪════════════════════════════╣
║ 1  │ Timothee │ 2018-03-01 13:02:20.904+00 ║
╚════╧══════════╧════════════════════════════╝

Friends Table

╔════╤════════╤══════════╤════════════════════════════╗
║ id │ userId │ friendId │ createdAt                  ║
╠════╪════════╪══════════╪════════════════════════════╣
║ 1  │ 1      │ 234      │ 2018-03-20 15:41:51.779+00 ║
╟────┼────────┼──────────┼────────────────────────────╢
║ 2  │ 1      │ 254      │ 2018-03-20 16:16:34.698+00 ║
╟────┼────────┼──────────┼────────────────────────────╢
║ 3  │ 1      │ 288      │ 2018-03-15 15:17:39.907+00 ║
╟────┼────────┼──────────┼────────────────────────────╢
║ 4  │ 1      │ 293      │ 2018-03-07 16:15:49.379+00 ║
╚════╧════════╧══════════╧════════════════════════════╝

Expected Output

╔═╤═══════╤════════════════════════════╤════════════════════════════╗
║ │ count │ startDate                  │ endDate                    ║
╠═╪═══════╪════════════════════════════╪════════════════════════════╣
║ │ 0     │ 2018-03-01 13:02:20.904+00 │ 2018-03-03 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 0     │ 2018-03-03 13:02:20.904+00 │ 2018-03-05 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 0     │ 2018-03-05 13:02:20.904+00 │ 2018-03-07 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 1     │ 2018-03-07 13:02:20.904+00 │ 2018-03-09 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 1     │ 2018-03-09 13:02:20.904+00 │ 2018-03-11 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 1     │ 2018-03-11 13:02:20.904+00 │ 2018-03-13 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 1     │ 2018-03-13 13:02:20.904+00 │ 2018-03-15 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 2     │ 2018-03-15 13:02:20.904+00 │ 2018-03-17 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 2     │ 2018-03-17 13:02:20.904+00 │ 2018-03-19 13:02:20.904+00 ║
╟─┼───────┼────────────────────────────┼────────────────────────────╢
║ │ 4     │ 2018-03-19 13:02:20.904+00 │ 2018-03-21 13:02:20.904+00 ║
╚═╧═══════╧════════════════════════════╧════════════════════════════╝

I can get the desired day interval with the following SQL-Server Query:

@DECLARE @EndDate DATE = GETDATE();
WITH start as 
(SELECT createdAt
 FROM user where id = '123'
) 
SELECT diff/10 
FROM ( 
    SELECT DATEDIFF(days, start, @EndDate) 
) as diff

How would this be in PostgreSQL?

I also know that I can retrieve the count of records between a static date range with SQL-Server Query:

SELECT count(*)
FROM friends
WHERE userId = '123'
AND date BETWEEN CURDATE() - INTERVAL 3 DAY AND CURDATE()

How would this be in PostgreSQL?

But what I am struggling with is the combining of these two things.

How can I retrieve the sum of records over ALL date intervals between the two dates?

I hope the question is understandable, thank you for any help or advice.

Best Answer

Postgres allows to generate a series of dates using generate_series() function.

select generate_series(createdAt, current_date, 
       (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval) ddate
       , (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval as ddays                
from   users
where  id = 1;

It returns:

ddate                   | ddays 
:---------------------- | :-----
2018-03-01 13:02:20.904 | 2 days
2018-03-03 13:02:20.904 | 2 days
2018-03-05 13:02:20.904 | 2 days
2018-03-07 13:02:20.904 | 2 days
2018-03-09 13:02:20.904 | 2 days
2018-03-11 13:02:20.904 | 2 days
2018-03-13 13:02:20.904 | 2 days
2018-03-15 13:02:20.904 | 2 days
2018-03-17 13:02:20.904 | 2 days
2018-03-19 13:02:20.904 | 2 days

Then you can use previous query to count friends between this intervals:

with x as
(
    select generate_series(createdAt, current_date, 
           (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval) ddate
           , (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval as ddays
    from   users
    where  id = 1
)
select ddate,
       (select count(*)
        from   friends
        where  createdAt >= ddate
        and    createdAt <  ddate + ddays) friends
from x;

Final result:

ddate                   | friends
:---------------------- | ------:
2018-03-01 13:02:20.904 |       0
2018-03-03 13:02:20.904 |       0
2018-03-05 13:02:20.904 |       0
2018-03-07 13:02:20.904 |       1
2018-03-09 13:02:20.904 |       0
2018-03-11 13:02:20.904 |       0
2018-03-13 13:02:20.904 |       0
2018-03-15 13:02:20.904 |       1
2018-03-17 13:02:20.904 |       0
2018-03-19 13:02:20.904 |       2

Or if you prefer a join instead of a correlated scalar subquery:

with x as
(
    select generate_series(createdAt, current_date, 
           (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval) ddate
           , (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval as ddays
    from   users
    where  id = 1
)
select ddate,
       count(friends.*) as friends
from   x
left join   friends
on     createdAt >= ddate
and    createdAt <  ddate + ddays
group by ddate
order by ddate;

This last option can be further modified to returned running totals, as per your expected output:

with x as
(
    select generate_series(createdAt, current_date, 
           (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval) ddate
           , (date_part('day', current_date::timestamp - createdAt::timestamp) / 10)::int * '1 day'::interval as ddays
    from   users
    where  id = 1
)
select ddate,
       count(friends.*) as friends,
       sum(count(friends.*)) over (order by ddate) as friends_total
from   x
left join   friends
on     createdAt >= ddate
and    createdAt <  ddate + ddays
group by ddate
order by ddate;

Result:

ddate                   | friends | friends_total
:---------------------- | ------: | ------------:
2018-03-01 13:02:20.904 |       0 |             0
2018-03-03 13:02:20.904 |       0 |             0
2018-03-05 13:02:20.904 |       0 |             0
2018-03-07 13:02:20.904 |       1 |             1
2018-03-09 13:02:20.904 |       0 |             1
2018-03-11 13:02:20.904 |       0 |             1
2018-03-13 13:02:20.904 |       0 |             1
2018-03-15 13:02:20.904 |       1 |             2
2018-03-17 13:02:20.904 |       0 |             2
2018-03-19 13:02:20.904 |       2 |             4

dbfiddle here