Postgresql – Why does COALESCE function not work for this query

coalescejoin;postgresqlsubquery

I am trying to count the number of columns that is retrieved from the subquery, and some subqueries may be empty. So instead of displaying it as an empty cell, I want it to be 0. Found out that I can do this by using the COALESCE function, but for example this query still gives me an empty cell:

CREATE VIEW third_table AS 
SELECT * 
FROM  (second_table  
NATURAL FULL JOIN (
   SELECT assignment_id, COALESCE( count(*), 0 ) AS num_60_79 
   FROM (
      SELECT assignment_id, mark_as_percent 
      FROM avg_required_table
      WHERE mark_as_percent >= 60 
      AND mark_as_percent < 80
      ) a 
   GROUP BY assignment_id
   ) b);

Is this how COALESCE is supposed to be used?

Best Answer

To answer your question

COALESCE would be utterly pointless where you placed it to begin with. The aggregate function count() never returns NULL. And logic dictates that num_60_79 can never be lower than 1 in the subquery. Related:

But that's irrelevant, because if a subquery returns no row, then it cannot return any values at all.

To be precise, in your case the subquery may return any number of rows but, when joining to second_table, if there is no row with matching values in (assignment_id, num_60_79) then all columns from the subquery b are filled with NULL values instead. Related:

NATURAL FULL JOIN is a very exotic way to join tables. Especially for somebody still learning how to use COALESCE. Quoting the manual:

NATURAL is shorthand for a USING list that mentions all columns in the two tables that have the same names.

and:

FULL OUTER JOIN returns all the joined rows, plus one row for each unmatched left-hand row (extended with nulls on the right), plus one row for each unmatched right-hand row (extended with nulls on the left).

I think it's safe to assume there is no column named num_60_79 in second_table and you surely wouldn't want to involve it in the join conditions if there was one. You only want to join on the column assignment_id, and that's what you should put in the query instead of the NATURAL key word.

Also, while FULL JOIN is theoretically possible here, we would typically see a LEFT JOIN.

You also don't need two layers of subqueries and some other noise in the query.

Proper query

All things considered, while you are ...

trying to count the number of columns that is retrieved from the subquery

... my educated guess is you want this query instead:

SELECT s.*, COALESCE(b.num_60_79, 0) AS num_60_79
FROM   second_table s
LEFT   JOIN (
   SELECT assignment_id, count(*) AS num_60_79 
   FROM   avg_required_table
   WHERE  mark_as_percent >= 60 
   AND    mark_as_percent <  80
   GROUP  BY assignment_id
   ) b USING (assignment_id);