Postgresql – Understanding JOIN Syntax

join;postgresql

Given:

postgres=# \d foo
                Table "public.foo"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | text    |           | not null | 
Indexes:
    "foo_pkey" PRIMARY KEY, btree (a)

postgres=# \d bar
                Table "public.bar"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | text    |           | not null | 
Indexes:
    "bar_pkey" PRIMARY KEY, btree (a)

postgres=# select * from foo;
 a |  b  
---+-----
 1 | one
(1 row)

postgres=# select * from bar;
 a |  b  
---+-----
 2 | two
(1 row)

I then join'd using the following JOIN syntax:

postgres=# select * from foo, bar;
 a |  b  | a |  b  
---+-----+---+-----
 1 | one | 2 | two
(1 row)

Then, I compared it to full outer join:

postgres=# select * from foo full outer join bar using (a);
 a |  b  |  b  
---+-----+-----
 1 | one | 
 2 |     | two
(2 rows)

and cross join:

postgres=# select * from foo cross join bar;
 a |  b  | a |  b  
---+-----+---+-----
 1 | one | 2 | two
(1 row)

Is it always true that the from a, b, c will produce a cross join?

Best Answer

Like @Akina says in his comment, "," and CROSS JOIN is equivalent:

A, B   <=>    A CROSS JOIN B

This is also known as CARTESIAN PRODUCT. The result will have a cardinality (size) of:

| A, B | = | A | * | B |

We can restrict the result so that it becomes a subset of the CROSS JOIN like:

A, B WHERE <some condition>

The corresponding operation becomes

A INNER JOIN B ON <some condition>

INNER can be left out:

A JOIN B ON <some condition>

OUTER JOIN is a bit more complicated to explain in terms of sets, but let's start with a LEFT OUTER JOIN:

A LEFT OUTER JOIN B ON <some condition>

This means:

A JOIN B ON <some condition>
UNION
A WHERE it does NOT EXISTS a corresponding B

I deliberately left out any information regarding columns in the resultset.

A RIGHT OUTER JOIN B ON <some condition>

is equal to

A JOIN B ON <some condition>
UNION
B WHERE it does NOT EXISTS a corresponding A

Finally FULL OUTER JOIN is a UNION between a LEFT and a RIGHT outer JOIN:

A FULL OUTER JOIN B ON <some condition>

is equal to:

A LEFT OUTER JOIN B ON <some condition>
UNION
A RIGHT OUTER JOIN B ON <some condition>

To get it completely right we would need to extend the explanation with the columns involved.

I use to think that the "," notation was short and sweet, but after spending years and years of tuning, modifying and extending existing SQL I now prefer the ANSI JOINs over the "," ones. As shown in the comments by ypercubeᵀᴹ fiddle strange things can happen if you have a "," join and the extend it with an ANSI join. I also find it much easier to convince myself that I have gotten the join conditions correct when they are closely located with the join, instead of the WHERE clause:

SELECT ... 
FROM A
JOIN B
    ON ...
   AND ...
JOIN C
    ON ...
   AND ...

vs

SELECT ... 
FROM A, B, C
WHERE ...
  AND ...
  AND ...
  AND ...