PostgreSQL LEFT JOIN – Understanding LEFT JOIN Confusion

distinctjoin;postgresql

I am having issues with this simple scenario. I'm probably missing some basic concept…

I have 2 tables, the first of "things" and the second of orders. I've created a sqlfiddle: http://sqlfiddle.com/#!17/e9d19/6/0

Query:

select t1.*, t2.*
from things t1
left join things_orders t2
on t1.id = t2.thing_id

I want to have the results show a row for each item and person. In my example, Adam ordered an apple, 2 bananas and 3 cherries. Ben ordered no apples (no row in the database), 2 bananas and 3 cherries. I want the result to have 6 rows (simplified output):

Apple Adam 1
Banana Adam 2
Cherry Adam 3
Apple Ben null <-- wanted row, but not showing
Banana Ben 2
Cherry Ben 3

I didn't think I needed to have a row in the second table with a null value but maybe I do.

This is the DDL which is also in the sqlfiddle:

CREATE TABLE things (
    id smallint NOT NULL,
    name text COLLATE pg_catalog."default",
    CONSTRAINT things_pkey PRIMARY KEY (id)
);

CREATE TABLE things_orders (
    person text COLLATE pg_catalog."default" NOT NULL,
    thing_id smallint NOT NULL,
    qty integer,
    CONSTRAINT things_orders_pk PRIMARY KEY (person, thing_id)
);

INSERT INTO things VALUES 
  (1, 'Apple')
, (2, 'Banana')
, (3, 'Cherry')
;

INSERT INTO things_orders VALUES
  ('Adam', 1, 1)
, ('Adam', 2, 2)
, ('Adam', 3, 3)
, ('Ben', 2, 2)
, ('Ben', 3, 3)
;

Best Answer

Typically, you'd also have a table people holding exactly one row for every relevant person. ('Ben' like in the example is hardly unique.) If you don't, consider creating it. Burns down to a standard many-to-many relationship:

Then build the Cartesian product from people and things with a CROSS JOIN, and LEFT JOIN to things_orders:

SELECT p.id AS person_id, p.person, t.id as thing_id, t.name as thing, tp.qty
FROM   people p
CROSS  JOIN things t
LEFT   JOIN things_orders tp ON tp.thing_id  = t.id
                            AND tp.person_id = p.id;

Related:

Meanwhile, if things_orders is big and has many rows per person, getting unique persons with plain DISTINCT is expensive. And index on (person) helps but not much while Postgres does not yet support index skip scans. (Did not make it into Postgres 12, maybe in Postgres 13 ...)

Until then, a recursive CTE is the fast workaround:

WITH RECURSIVE persons AS (
   (
   SELECT person
   FROM   things_orders
   ORDER  BY person
   LIMIT  1
   )
   UNION ALL
   SELECT x.person
   FROM   persons p
   CROSS  JOIN LATERAL (
      SELECT person
      FROM   things_orders
      WHERE  person > p.person
      ORDER  BY person
      LIMIT  1
      ) x
   )
SELECT p.person, t.thing_id, t.name, tp.qty
FROM   persons p
CROSS  JOIN (SELECT id AS thing_id, name FROM things) t
LEFT   JOIN things_orders tp USING (thing_id, person);

db<>fiddle here

You may want COALESCE(tp.qty, 0) to show 0 instead of NULL ...

Related:

Aside: "id" and "name" are generally unhelpful identifiers.