PostgreSQL – Using JOIN to Return Count Including Zero

aggregatecountjoin;postgresqlpostgresql-9.6

I have this query:

select settings.customer_id, current::jsonb->'alarms' as alarms, from settings
join (select customer_id, current::jsonb->'alarms' as alarms from settings) as res on res.customer_id=settings.customer_id
where res.alarms is not null;

which returns

+------------------+
|customer_id|alarms|
|1          |{...} |
|2          |{...} |
|3          |{...} |
+------------------+

Then I have another entity registration, which fields customer_id and date, where date is of type date and I store the values in format YYYY-MM-DD.

What I want to achieve is to join the registration entity on customer_id and get a count on the registration records for a given single date, f.e. '2018-11-20'. But (here's the catchy part) I want to retrieve it in a way so if there are no registration records for the date, then return 0, otherwise return the actual number.

With the solutions I've tried so far, I would get only records that have the alarms AND have registrations as well, as if I have this:

select settings.customer_id, current::jsonb->'alarms' as alarms, coalesce(regs.count, 0) from settings
join (select customer_id, current::jsonb->'alarms' as alarms from settings) as res on res.customer_id=settings.customer_id
join (select customer_id, count(customer_id) from registration where date='2018-11-20' group by customer_id) as regs on settings.customer_id=reg.customer_id
where res.alarms is not null;

or

select settings.customer_id, current::jsonb->'alarms' as alarms, coalesce(count(regs.customer_id), 0) from settings
join (select customer_id, current::jsonb->'alarms' as alarms from settings) as res on res.customer_id=settings.customer_id
join registration as regs on settings.customer_id=regs.customer_id
where res.alarms is not null and regs.date='2018-10-10'
group by settings.customer_id, settings.current;

If I would put it in other words: get records which have alarms object and also get the number of their registrations for the given date, no matter if its 0 or > 0.

Best Answer

LEFT JOIN is the key, like ypercube commented.

And you probably don't need the self-join - assuming settings.customer_id is unique not null.

If your predicate is selective (more than a few rows without alarms), I suggest to count in a LATERAL subquery:

SELECT s.customer_id, s.current::jsonb->'alarms' AS alarms
     , coalesce(r.reg_ct, 0) AS reg_ct
FROM   settings s
LEFT   JOIN LATERAL (
   SELECT count(*) AS reg_ct
   FROM   registration
   WHERE  customer_id = s.customer_id
   AND    date = '2018-10-10'
   ) r ON true
WHERE  s.current::jsonb->'alarms' IS NOT NULL;

Else, a plain subquery should be faster:

SELECT s.customer_id, s.current::jsonb->'alarms' AS alarms
     , coalesce(r.reg_ct, 0) AS reg_ct
FROM   settings s
LEFT   JOIN (
   SELECT customer_id, count(*) AS reg_ct
   FROM   registration
   WHERE  date = '2018-10-10'
   GROUP  BY 1
   ) r USING (customer_id)
WHERE  s.current::jsonb->'alarms' IS NOT NULL;

Either way, it's typically faster and less error prone to aggregate before the join.
Related: