PostgreSQL: split date range by business days, then aggregate by month

aggregate postgresql

I have the following rows in a table (dd-mm-yyyy).

Start      | End        | Value
---------------------------------
01-01-2019 | 31-03-2019 | 64
01-02-2019 | 30-04-2019 | 126
01-03-2019 | 31-05-2019 | 66

I would like to divide each value by the number of working days (Monday to Friday only, ignoring holidays) between its start and end dates, then sum the resulting per-day values by month in Postgres.

Based on this year's working days per month being:

Month  | Working Days
--------------------
Jan 19 | 23
Feb 19 | 20
Mar 19 | 21
Apr 19 | 22
May 19 | 23

  • The 1st row spans 64 working days in Jan, Feb and Mar, so its value per working day is 64 / 64 = 1. That gives 23 in Jan, 20 in Feb and 21 in Mar.
  • The 2nd row spans 63 working days in Feb, Mar and Apr, so its value per working day is 126 / 63 = 2. That gives 40 in Feb, 42 in Mar and 44 in Apr.
  • The 3rd row spans 66 working days in Mar, Apr and May, so its value per working day is 66 / 66 = 1. That gives 21 in Mar, 22 in Apr and 23 in May.
  • Summing these per-month values across all rows should give the result below.

Month   | Value
----------------
01-2019 | 23
02-2019 | 60
03-2019 | 84
04-2019 | 66
05-2019 | 23
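
The working-day counts in the first table can be reproduced with a query along these lines. This is only a sketch: the date bounds are taken from the sample rows, and "working day" means Monday to Friday with no holiday calendar.

```sql
-- Count Monday-to-Friday days per month for January through May 2019.
-- The bounds come from the sample rows; public holidays are not considered.
select to_char(dt, 'Mon YY') as month,
       count(*)              as working_days
from generate_series(date '2019-01-01', date '2019-05-31', interval '1 day') as t(dt)
where extract(dow from dt) not in (0, 6)   -- 0 = Sunday, 6 = Saturday
group by date_trunc('month', dt), to_char(dt, 'Mon YY')
order by date_trunc('month', dt);
```

For 2019 this should return the same five counts as the table: 23, 20, 21, 22 and 23.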

So it's like grouping by month but the values are weighted by the number of working days per month.

Is it possible to do this in Postgres?

Best Answer

Assuming the table has a primary (or unique) key, called id here, you can calculate the value per working day like this:

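select to_char(dt, 'yyyy-mm') as month,
       value,
       -- the window count runs after the where clause, so it counts weekdays only
       value::numeric / count(*) over (partition by id) as value_per_day
from data
  -- one row per calendar day between start and "end" (inclusive)
  left join generate_series(start, "end", interval '1 day') as t(dt) on true
where extract(dow from dt) not in (0,6); -- 0 = Sunday, 6 = Saturday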

generate_series() generates one row for each day between start and end, and the where clause removes the weekends. id is the unique identifier of the original rows; it is required to count the number of days "per row". With that result we can now aggregate the final result:

select month, sum(value_per_day) as value
from (
  -- one row per working day: the month it falls in and the row's daily share
  select to_char(dt, 'yyyy-mm') as month,
         value,
         value::numeric / count(*) over (partition by id) as value_per_day
  from data
    left join generate_series(start, "end", interval '1 day') as t(dt) on true
  where extract(dow from dt) not in (0,6) -- 0 = Sunday, 6 = Saturday
) per_day
group by month
order by month;
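
Before trusting the monthly totals, it can help to look at the per-row figures. The following is a sketch of such a check, using the same table and column names as the queries above and the same Monday-to-Friday definition of a working day. It lists how many weekdays each row covers and the resulting daily value.

```sql
-- Per-row check: number of weekdays in each range and the resulting daily value.
select d.id,
       d.value,
       count(*)                    as working_days,
       d.value::numeric / count(*) as value_per_day
from data d
  cross join lateral generate_series(d.start, d."end", interval '1 day') as t(dt)
where extract(dow from t.dt) not in (0, 6)   -- drop Sunday (0) and Saturday (6)
group by d.id, d.value
order by d.id;
```

For the three sample rows from the question this should report 64, 63 and 66 working days, and daily values of 1, 2 and 1.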

Online example: https://rextester.com/KPGOFU25611
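
To run the queries locally, a table like the following can be used. The question does not show the table definition, so the column types and the id values are assumptions. Only the dates and values come from the question.

```sql
-- Hypothetical setup: column types and id values are assumptions,
-- the dates and values are the sample rows from the question.
create table data
(
  id    integer primary key,
  start date    not null,
  "end" date    not null,   -- "end" is a reserved word and must be quoted
  value integer not null
);

insert into data (id, start, "end", value)
values (1, date '2019-01-01', date '2019-03-31',  64),
       (2, date '2019-02-01', date '2019-04-30', 126),
       (3, date '2019-03-01', date '2019-05-31',  66);
```

The value column is declared as integer here, which is why the queries cast it to numeric before dividing; without the cast Postgres would do integer division.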