Sql-server – Understanding Variations on Multiple Joins

join;postgresqlsql serversyntax

I have two versions of a multiple join which produce identical results. The five tables are:

customers ← sales ← saleitems → paintings → artists

The arrows (hopefully) show the relationship between the tables.

Each table has a primary key called id, and the inner tables have a foreign key to another table called [othertable]id.

The first version is a classic join with the table listed in the above order

SELECT
    c.id, c.givenname, c.familyname,
    a.givenname, a.familyname
FROM
    customers c
        JOIN sales s ON c.id=s.customerid
            JOIN saleitems si ON s.id=si.saleid
                JOIN paintings p ON si.paintingid=p.id
                    JOIN artists a ON p.artistid=a.id

;

The second version jas all the JOIN clauses first, and the the ON clauses, in the reverse order.

SELECT
    c.id, c.givenname, c.familyname,
    a.givenname, a.familyname
FROM
    customers c
        JOIN sales s 
            JOIN saleitems si
                JOIN paintings p
                    JOIN artists a
                    ON p.artistid=a.id
                ON si.paintingid=p.id
            ON s.id=si.saleid
        ON c.id=s.customerid

;

OK, so it works, but can any one explain how the second version works? Why must the order of the ON clauses be the reverse of the JOIN clauses? Is it possible to randomly order the JOIN or ON clauses?

I have tested this in Microsoft SQL as well as with PostgreSQL.

Best Answer

You can't randomize them, the order is important. Maybe you can see it clearer if we add a few parenthesis between the joins.

SELECT
    c.id
FROM
    customers c
        JOIN (sales s JOIN saleitems si ON s.id = si.saleid)
    ON c.id = s.customerid

What this query is doing is joining first sales with saleitems and then their result is being joined with customers. You can't reference a customers column inside the parenthesis join because it's not accessible:

SELECT
    c.id
FROM
    customers c
        JOIN (sales s JOIN saleitems si ON s.id = si.saleid 
            AND c.id = c.customerid) -- What is c.customerid here??

Msg 4104, Level 16, State 1, Line 6 The multi-part identifier "s.customerid" could not be bound.

If you ask me, I'd stick with the first way of writing the query, it's way clearer for most programmers that way.