PostgreSQL – How to simplify a carry-over values process

postgresql

I have created a sample case here for Postgres 10.0 (I am actually running the AWS equivalent of 10.1):

https://www.db-fiddle.com/#&togetherjs=EKTJ6eQ62V

where you can find the table:

create table test(l text, v1 integer, v2 real, monthly_date date);

insert into test 
    values 
    ('a', 2, 1.3, '2001-01-01'),
    ('a', 1, 2.2, '2001-02-01'),
    ('a', 5, 6.2, '2001-04-01'),
    ('b', 3, 9.0, '2001-03-01');

The expected output is:

l  | v1 | v2  | monthly_date
:- | -: | :-- | :-----------------------
a  |  2 | 1.3 | 2001-01-01T00:00:00.000Z
a  |  1 | 2.2 | 2001-02-01T00:00:00.000Z
a  |  0 | 2.2 | 2001-03-01T00:00:00.000Z
a  |  5 | 6.2 | 2001-04-01T00:00:00.000Z
a  |  0 | 6.2 | 2001-05-01T00:00:00.000Z
a  |  0 | 6.2 | 2001-06-01T00:00:00.000Z
b  |  3 | 9   | 2001-03-01T00:00:00.000Z
b  |  0 | 9   | 2001-04-01T00:00:00.000Z
b  |  0 | 9   | 2001-05-01T00:00:00.000Z
b  |  0 | 9   | 2001-06-01T00:00:00.000Z

Within the monthly range from '2001-01-01' to '2001-06-01', every month that is missing after the first row of a given 'l' is filled with the values of the most recent preceding month. Months before the first row of an 'l' are not generated (note that 'b' starts at '2001-03-01'). The only exception is column 'v1', which is set to 0 for the filled-in months.

The query that I am using at the moment is:

WITH
    md AS (
      SELECT *,
      LEAD(monthly_date) OVER (PARTITION BY l ORDER BY monthly_date) AS next_date
      FROM test
    ),
    calendar AS (
      select interval_date::date 
      from generate_series('2001-01-01'::date, '2001-06-01'::date, '1 month'::interval) interval_date
    )
    select T.l, coalesce (m2.v1, 0) as v1, T.v2, T.interval_date as monthly_date
    from (
        SELECT m.l, m.v1, m.v2, c.interval_date
        FROM calendar c
        JOIN md m
            ON c.interval_date BETWEEN m.monthly_date AND
            (CASE WHEN m.next_date IS NULL THEN '2001-06-01' ELSE m.next_date - '1 month'::interval END)
    ) T
    LEFT JOIN md m2 ON m2.l = T.l AND m2.monthly_date = T.interval_date;

This works, but it relies on a LEFT JOIN to inject the value 0 into column 'v1' for the filled-in months. Is there a better (possibly more efficient) way to do that?

Best Answer

You should test this yourself; I can't guarantee that it is better in terms of efficiency.

with x as -- every (l, month) pair from each l's first month through 2001-06-01
(
    select distinct l, m
    from   test tt
    join   lateral (select m from
                    generate_series((select min(monthly_date) from test where l=tt.l), 
                                     '2001-06-01'::date, '1 month'::interval) m) t1 on true
)
select    x.l, 
          coalesce(test.v1, 0) as v1,
          coalesce(test.v2, (select v2 -- only executed for missing rows
                             from test 
                             where l = x.l and monthly_date < x.m 
                             order by l, monthly_date desc
                             limit 1)) as v2,
          m as monthly_date
from      x
left join test
on        test.l = x.l
and       date_trunc('month', monthly_date) = date_trunc('month', m)
order by  x.l, x.m;


l  | v1 | v2  | monthly_date          
:- | -: | :-- | :---------------------
a  |  2 | 1.3 | 2001-01-01 00:00:00+00
a  |  1 | 2.2 | 2001-02-01 00:00:00+00
a  |  0 | 2.2 | 2001-03-01 00:00:00+00
a  |  5 | 6.2 | 2001-04-01 00:00:00+01
a  |  0 | 6.2 | 2001-05-01 00:00:00+01
a  |  0 | 6.2 | 2001-06-01 00:00:00+01
b  |  3 | 9   | 2001-03-01 00:00:00+00
b  |  0 | 9   | 2001-04-01 00:00:00+01
b  |  0 | 9   | 2001-05-01 00:00:00+01
b  |  0 | 9   | 2001-06-01 00:00:00+01
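
As a further option, here is a minimal sketch that keeps the LEAD-based range join from the question but drops the second join: since each calendar month is matched to exactly one source row, a CASE expression can decide whether v1 is the real value or 0. It assumes that monthly_date is always the first day of a month and that (l, monthly_date) is unique, as in the sample data. The alias d is introduced here for illustration only.

```sql
WITH md AS (
    -- Each row plus the date of the next row for the same l (NULL for the last one).
    SELECT l, v1, v2, monthly_date,
           LEAD(monthly_date) OVER (PARTITION BY l ORDER BY monthly_date) AS next_date
    FROM   test
),
calendar AS (
    -- One row per month in the reporting range.
    SELECT d::date AS interval_date
    FROM   generate_series('2001-01-01'::date, '2001-06-01'::date, '1 month'::interval) AS d
)
SELECT   m.l,
         -- v1 is kept only in the month the row belongs to; carried-over months get 0.
         CASE WHEN c.interval_date = m.monthly_date THEN m.v1 ELSE 0 END AS v1,
         m.v2,
         c.interval_date AS monthly_date
FROM     md m
JOIN     calendar c
  ON     c.interval_date >= m.monthly_date
 AND     (m.next_date IS NULL OR c.interval_date < m.next_date)
ORDER BY m.l, c.interval_date;
```

On the sample data this should return the same ten rows as the expected output in the question, with monthly_date as a plain date. Whether it is actually faster depends on the data volume and the plan the optimizer picks, so it is worth measuring.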

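
To check which variant is cheaper on real data, one approach is to prefix each candidate query with EXPLAIN (ANALYZE, BUFFERS) and compare the reported plans and timings. The sketch below does this for the correlated "latest earlier row" lookup on its own, with literal values standing in for x.l and x.m. The index and its name (test_l_monthly_date_idx) are assumptions added for illustration; on a four-row table the planner will most likely ignore it and scan sequentially.

```sql
-- Hypothetical index supporting lookups by l ordered by monthly_date.
CREATE INDEX IF NOT EXISTS test_l_monthly_date_idx ON test (l, monthly_date);

-- EXPLAIN ANALYZE executes the statement and reports the actual plan,
-- row counts, and timing; BUFFERS adds block hit/read counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT   v2
FROM     test
WHERE    l = 'a'
AND      monthly_date < DATE '2001-03-01'
ORDER BY monthly_date DESC
LIMIT    1;
```

The same prefix works on the full queries above; run each a few times so that caching effects do not dominate the comparison.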