Postgresql – Creating a GroupBy query to include a result for when no results match

postgresqlquery

Let's say I need a query to see sales results by Month and I want the result-set to include results for each month whether there were any sales in that month or not.

And let's say my table "sales" has the columns "sales_date" and "sales_amount" and I want a result-set to look like:

Jan, 100.00
Feb, 240.00
Mar, 0.00
Apr, 430.00

So, if I want to sum the values by month and there aren't any records for Mar, I still want to see a result for Mar that's 0. How would I do that?

Best Answer

The usual way to get a series in postgres is with generate_series. This function produces a series of integers or timestamps - you can use either but assuming your 'dates' are really timestamptz, here's how you might go about it if you are on 8.4 or above:

testbed:

create table sales(sales_date timestamptz, sales_amount numeric);
insert into sales(sales_date, sales_amount) values('2011-01-15 12:00', 100);
insert into sales(sales_date, sales_amount) values('2011-02-15 12:00', 240);
insert into sales(sales_date, sales_amount) values('2011-04-15 12:00', 400);
insert into sales(sales_date, sales_amount) values('2011-04-16 12:00', 30);

query:

with w as ( select month, sum(sales_amount) as total
            from (select date_trunc('month',sales_date) as month, sales_amount from sales) z
            group by month )
select to_char(month, 'fmMon') as month, coalesce(total, 0) as total
from (select generate_series(min(month), max(month), '1 month'::interval) as month from w) m
     left outer join w using(month);

result:

 month | total
-------+-------
 Jan   |   100
 Feb   |   240
 Mar   |     0
 Apr   |   430

--edit: a bit of extra detail on the query:

  • produce a summary of sales by month (but no month present if no sales):

    with w as ( select month, sum(sales_amount) as total
                from ( select date_trunc('month',sales_date) as month, sales_amount
                       from sales ) z
                group by month )
    

    which could alternatively be written as:

    with w as ( select date_trunc('month',sales_date) as month, sum(sales_amount) as total
                from sales
                group by date_trunc('month',sales_date) )
    
  • produce an unbroken series of months (without sales) from the minimum to the maximum:

    select generate_series(min(month), max(month), '1 month'::interval) as month from w
    
  • outer join the unbroken series to the summary of sales by month to produce an unbroken series with sales (or null sales if no sales present):

    left outer join w using(month)
    
  • for the months with null sales, change the null to a 0:

    coalesce(total, 0)