Postgresql – Join table into different columns

aggregate-filterjoin;pivotpostgresql

I have two tables (simplified).

table1:  
 - id (serial)

table2:  
 - table1_id (integer)  
 - value (text)  
 - type (integer)

I want to join these two tables, but depending on the type, I want to join the "value" of table2 into different columns. Result should look like this:

id | table2value0 | table2value1 | table2othervalues

table2value0 is the value from table2 where type is 0.
table2value1 is the value from table2 where type is 1.
table2othervalues are the remaining values where type is something else, as an array.

If there are more than one value for table2value0 and table2value1, I only want the first. Order doesn't matter, so the order it was inserted.

If some entries in table1 do not have matches in table2, they should appear in the output as NULL.

Best Answer

Assuming Postgres 9.4 or later:

SELECT id, t2.table2value0, t2.table2value1, t2.table2othervalues
FROM   table1 t1
LEFT   JOIN (
   SELECT table1_id AS id
        , min(value) FILTER (WHERE type = 0) AS table2value0
        , min(value) FILTER (WHERE type = 1) AS table2value1
        , array_agg(value) FILTER (WHERE type IN (1,2) IS NOT TRUE) AS table2othervalues
   FROM   table2
   GROUP  BY table1_id
   ) t2 USING (id);

The aggregate FILTER clause was introduced with pg 9.4. There are various less elegant alternatives for older versions:

Postgres Nested WHEN Aggregate Function

This returns one row for every row in table1, missing types in table2 result in NULL values.

If there are more than one value for table2value0 and table2value1, I only want the first.

You seem to be under the impression that there is a natural order of rows, but there is not:

Does Postgres preserve insertion order of records?

I picked the minimum value instead, since your requirement is effectively undefined.

The best query depends on your actual table definition (including all constraints).

In particular, the expression type IN (1,2) IS NOT TRUE also catches type IS NULL, which may not be necessary, depending on your missing table definition.

Aside:
Your desire to return a single (scalar) value for types 1 and 2, but an array for the rest makes it harder to solve with crosstab() (but still possible). Basics:

PostgreSQL Crosstab Query

Related Solutions

Postgresql – Retrieving data from inner join using LIMIT and OFFSET

Use a subquery (as displayed) or CTE for that purpose:

SELECT *
FROM  (
   SELECT qid, gid
   FROM   table1
   ORDER  BY date DESC
   LIMIT  10
   OFFSET ?
   ) q
JOIN   table2 a USING (qid, gid)

USING (qid, gid) is just a shortcut for ON q.qid = a.qid AND q.gid = a.gid with the side effect that the two columns are only included once in the result.

Mysql – Subquery or multiple left joins, or both

There are many ways to do this. Most common are with a LEFT JOIN / IS NULL check, NOT IN or NOT EXISTS subquery. Here's a solution with the 3rd option:

SELECT t2.id, t2.name 
FROM table2 AS t2
WHERE NOT EXISTS
      ( SELECT *
        FROM table3 AS t3
        WHERE t3.table1_id = @t1_id     -- the t1.id value (choice) that you want to check
          AND t3.table2_id = t2.id
      ) ;

Best Answer

Related Solutions

Postgresql – Retrieving data from inner join using LIMIT and OFFSET

Mysql – Subquery or multiple left joins, or both

Related Question