PostgreSQL – How to Group by ID and Calculate Average Time Delta

aggregatepostgresqlpostgresql-14

I have a table with ids and timestamps:

postgres=> select id_user, date_seconds from orders.orders order by id_user limit 10;
           id_user            |    date_seconds
------------------------------+---------------------
 00D4jzHHesfCXGO6VQfUun9JcCW2 | 2021-09-20 02:00:00
 00D4jzHHesfCXGO6VQfUun9JcCW2 | 2021-10-06 02:00:00
 00D4jzHHesfCXGO6VQfUun9JcCW2 |
 00d9QGWjmbYdRTGw4h4GynhDp2K2 | 2021-10-01 02:00:00
 00OI3ruCKeOFdIro11eVNcfp84A2 | 2021-10-06 02:00:00
 00OI3ruCKeOFdIro11eVNcfp84A2 | 2021-08-24 02:00:00
 00qwpdw23BSEBGkfJhR4Hb7Kg1v2 | 2021-09-17 02:00:00
 00qwpdw23BSEBGkfJhR4Hb7Kg1v2 | 2021-09-20 02:00:00
 00qwpdw23BSEBGkfJhR4Hb7Kg1v2 | 2021-10-04 02:00:00
 00qwpdw23BSEBGkfJhR4Hb7Kg1v2 | 2021-08-20 02:00:00
(10 rows)

As you can see, one id_user can appear multiple times in this table.

I would like to group this table by id_user. However, I would also want to have the average delta time of the date_seconds column between the rows of that particular id_user.

I figured I might need to use the LAG psql function, but I am unsure how or if I`m even on the right track here.

Examples as suggested by the comments:

Suppose I have the following table:

INSERT INTO orders(id_user, ordered_at) VALUES
('1', '2021-09-20 02:00:00'),
('1', '2021-10-06 02:00:00'),
('2', '2021-10-01 02:00:00'),
('3', '2021-10-06 02:00:00'),
('3', '2021-08-24 03:00:00'),
('4', '2021-08-20 02:00:00'),
('4', '2021-09-17 02:00:00'),
('4', '2021-09-20 02:00:00'),
('4', '2021-10-04 02:00:00');

The desired result would be the following:
The avg time between a user's orders can be calculated as follows:
AVG( time interval between orders )

example: id_user 1: AVG( 2021-09-20 02:00:00 - 2021-10-06 02:00:00) which equals 16 days and 0 hours interval
example: id user 4: ACG( (2021-08-20 02:00:00 - 2021-09-17 02:00:00) , (2021-09-17 02:00:00 - 2021-09-20 02:00:00) , (2021-09-20 02:00:00 - 2021-10-04 02:00:00)  )
example: id user 4: AVG( (interval '28 days') , (interval '3 days') , (interval '14 days')  )
example: id user 4: 28+3+14 / 3 = 15 --> result should be interval '15 days'

postgres=> SELECT magic(ordered_at) as avg_timedelta, id_user from orders where id_user='4' group by id_user;
            id_user            |    avg_timedelta
------------------------------+---------------------
              4                |      15 days

Best Answer

Something like:

select id_user, (max(ordered_at)-min(ordered_at)) / (count(1)-1)
from orders
group by id_user
having count(1) > 1;

If you want to use lead (or lag) instead, you can determine the duration between two consecutive rows and then apply avg on that set:

with t as (
    select id_user, ordered_at
         , lead(ordered_at) over (partition by id_user order by ordered_at) as ld_ordered_at
    from orders
)
select id_user, avg(ld_ordered_at - ordered_at) 
from t
where ld_ordered_at is not null
group by id_user;

Fiddle

The idea behind the first solution is that:

(x2-x1)+(x3-x2)+...+(x_n+1-x_n) <=>
(-x1+(x2-x2)+(x3-x3)+...+(x_n-1-x_n-1)+x_n) <=>
(x_n-x1)

and then divide with n-1 to get the average