I have a query that looks like this:
SELECT *
FROM TBLA A
LEFT JOIN TBLB B ON A.Col1 = B.Col2
RIGHT JOIN TBLC C ON B.Col3 = C.Col4
JOIN TBLD D ON C.Col5 = D.Col6
In which order will the joins be resolved? I'm most interested in SQL Server, and will mark an explanation for it as the answer, but am equally interested in the ANSI/ISO standard and how it works in the various RDBMS.
The reason for this question was to figure out why the results differed from this query
SELECT *
FROM TBLA A
CROSS JOIN TBLC C
LEFT JOIN TBLB B ON A.Col1 = B.Col2 AND B.Col3 = C.Col4
JOIN TBLD D ON C.Col5 = D.Col6
Best Answer
Logically the joins are resolved in the order of the
ON
clauses from left to right.The output of each join is a virtual table that goes into the next join.
So for the query in your question the virtual table result of
A LJ B
is then right joined ontoC
. The join condition ofB.Col3 = C.Col4
will lose any null extended rows preserved by the original left join effectively turning the first join back into a inner join and the resulting virtual table (which preserves all rows fromC
) is then inner joined ontoD
.So your initial query can be simplified as
Which is effectively the same as
(A IJ B) ROJ (C IJ D)
The order of the
ON
clauses is not necessarily the same as the order the tables appear in the query. This could also be rewritten as(C IJ D) LOJ (A IJ B)
The position of the on clauses means that the outer join is carried out between the two virtual tables resulting from
(C IJ D)
and(A IJ B)
rather than just on a single table.In your second query conceptually the virtual table
A x C
is left joined onto B preserving the entire cartesian product then the result of that is joined ontoD
with the predicateC.Col5 = D.Col6
. This eliminates any rows from the final result that do not inner join betweenC
andD
meaning it is equivalent to