PostgreSQL Query – Differentiate Between No Child Rows and No Parent Row in One SELECT

postgresql

Say I have a table C that references rows from tables A and B:

id, a_id, b_id, ...

and a simple query:

SELECT * FROM C WHERE a_id=X AND b_id=Y

I would like to differentiate between the following cases:

  • No row exists in A where id = X
  • No row exists in B where id = Y
  • Both such rows in A and B exist, but no rows in C exist where a_id = X and b_id = Y

The above query will return empty result in all those cases.

In case of one parent table I could do a LEFT JOIN like:

SELECT * FROM A LEFT JOIN C ON a.id = c.a_id WHERE c.a_id = X

and then check if the result is empty (no row in A exists), has one row with NULL c.id (row in A exists, but no rows in C exist) or 1+ rows with non-NULL c.id (row in A exists and at least one row in C exists). A bit messy but it works, but I was wondering if there is a better way of doing this, especially if there is more than one parent table?

Best Answer

I think I managed to get a satisfactory solution using the following two features:

  • Subselect bound to a column, which allows me to check if a row exists and (importantly) get a NULL value otherwise (e.g. SELECT (SELECT id FROM a WHERE id = 1) as a_id))

  • Common Table Expressions

Initial data:

CREATE TABLE people 
(
  id integer not null primary key, 
  name text not null
);

CREATE TABLE thing_types 
(
  id integer not null primary key, 
  name text not null
);

CREATE TABLE things
(
  id integer not null primary key, 
  person_id integer not null references people(id), 
  thing_type_id integer not null references thing_types(id), 
  name text not null
);

INSERT INTO people VALUES (1, 'Bill');
INSERT INTO thing_types VALUES (1, 'game');

INSERT INTO things VALUES (1, 1, 1, 'Duke Nukem');
INSERT INTO things VALUES (2, 1, 1, 'Warcraft 2');

And the query:

WITH v AS (
  SELECT (SELECT id FROM people WHERE id=<person_id_param>) AS person_id, 
         (SELECT id FROM thing_types WHERE id=<thing_type_param>) AS thing_type_id
)
SELECT v.person_id, v.thing_type_id, things.name 
FROM 
  v LEFT JOIN things 
    ON v.person_id = things.person_id AND v.thing_type_id = things.thing_type_id

This query will always return at least one row. In case if both parent table ids are valid and there are some records, I get

person_id  thing_type_id  name
-------------------------------------
        1              1   Duke Nukem
        1              1   Warcraft 2

If either person_id or thing_type_id are invalid, I get one row where name is NULL and either person_id or thing_type_id is NULL:

person_id  thing_type_id  name
-------------------------------------
     NULL              1         NULL

If both person_id and thing_type_id are valid but there are no records in things, I get one row where both person_id and thing_type_id are not NULL, but the name is NULL:

person_id  thing_type_id  name
-------------------------------------
        1              1         NULL

Since I have a NOT NULL constraint on things.name, I know that this case can only mean that there are no matching records in things.