Postgresql – How to properly merge the rows of several table with default values

join;postgresql

Problem

I have three tables that are structured like so:

t1

 id | count_1 
--------------
 1  | 9
 2  | 4 
 3  | 3

t2

 id | count_2 
--------------
 1  | 2
 3  | 3

t3

 id | count_3 
--------------
 1  | 1
 4  | 8

id is unique in each table. Note that not all ids occur in each table. Here is the SQL to create those tables if you'd like to test.

I'm trying to merge all those tables with a column for each count, defaulting to zero if there is no count for that particular id. Like this:

 id | count_1 | count_2 | count_3 
----------------------------------
 1  | 9       | 2       | 1       
 2  | 4       | 0       | 0       
 3  | 3       | 3       | 0       
 4  | 0       | 0       | 8       

Attempt

I thought this was a natural use case for a full outer join, like this:

SELECT
  COALESCE(t1.id, t2.id, t3.id) as id,
  COALESCE(t1.count_1, 0) as count_1,
  COALESCE(t2.count_2, 0) as count_2,
  COALESCE(t3.count_3, 0) as count_3
FROM
  t1
FULL OUTER JOIN t2
  ON t1.id = t2.id
FULL OUTER JOIN t3
  ON t1.id = t3.id
ORDER BY id ASC;

But this returns a result with non unique ids, where each row is just a row from one of the original tables with zeroes filling in the remaining columns:

 id | count_1 | count_2 | count_3 
----------------------------------
 1  | 9       | 0       | 0         # <- should
 1  | 0       | 2       | 0         # <- be
 1  | 0       | 0       | 1         # <- one row
 2  | 4       | 0       | 0       
 3  | 3       | 0       | 0         # <- should also be
 3  | 0       | 3       | 0         # <- one row
 4  | 0       | 0       | 8       

Evidently I don't understand outer joins as well as I thought I did. Can anyone show me the correct way to do this?

Best Answer

You could use FULL JOIN but the code gets a bit messy - at least for my taste. With 3 tables it's not so bad, you'd only need to change:

FULL OUTER JOIN t3
  ON t1.id = t3.id

to:

FULL OUTER JOIN t3
  ON COALESCE(t1.id, t2.id) = t3.id

but with more tables, it gets rather ugly. The other option is to gather all distinct id values and then LEFT JOIN all the tables:

SELECT
  d.id,
  COALESCE(t1.count_1, 0) AS count_1,
  COALESCE(t2.count_2, 0) AS count_2,
  COALESCE(t3.count_3, 0) AS count_3
FROM
  ( SELECT id FROM t1
  UNION
    SELECT id FROM t2
  UNION
    SELECT id FROM t3
  ) AS d
  LEFT JOIN t1 ON t1.id = d.id
  LEFT JOIN t2 ON t2.id = d.id
  LEFT JOIN t3 ON t3.id = d.id
ORDER BY id ;