PostgreSQL Join – How Additional Natural Joins Reduce Row Count

postgresql

This is real results from my database. I spend some time trying to reproduce this behavior with an SSCCE but failed. How is the last of the below results possible?

SELECT COUNT(*) FROM rr.resource    a
--- 15771

SELECT COUNT(*) FROM rr.resource    a
NATURAL JOIN rr.interface   b
--- 41419

SELECT COUNT(*) 
FROM rr.resource    a
NATURAL JOIN rr.interface   b
NATURAL JOIN rr.interface   c
--- 0

The zero (0) row count in the last query is consistent and repeatable every time.
In some examples I created, adding a second natural join with the same table twice yields the same number of rows (as I was expecting). How is then the above result possible?

Best Answer

Basically, NULL is causing this, because NULL<>NULL. One of the columns in your self-joined table will be all NULLs.

Here's a little test case that shows why this can happen. Naughty NULL equality and the way NATURAL JOIN works, picking column names to join on for you.

Setup:

create table one ( a integer, b integer );

CREATE TABLE two ( A INTEGER , c integer);

insert into one values (1,1);

insert into two values (1,1);

insert into two values (1,null);

insert into two values(2,NULL);

Queries:

SELECT COUNT(*)
FROM ONE
NATURAL JOIN TWO tttt;

SELECT COUNT(*)
FROM ONE
NATURAL JOIN TWO tttt
NATURAL JOIN TWO ttttt;

The first query will give 2, the second query 1.

Your problem will just be a more advanced version of this.

Use JOIN .... USING with the columns of your own choice to workaround this.