For some irrelevant reasons it's hard for me to get LEFT OUTER JOINS
working with multiple ON
clauses. Since simply moving these clauses to the WHERE
statement results in different result sets, I add an additional OR IS NULL
check.
LEFT OUTER JOIN ON query
SELECT t1.id, t2.ShardKey
FROM Table1 t1
LEFT OUTER JOIN Table2 t2 on t1.table2 = t2.id and t1.ShardKey = t2.ShardKey
WHERE t1.id = @id and t1.ShardKey = @shardkey
LEFT OUTER JOIN WHERE OR IS NULL query
SELECT t1.id, t2.ShardKey
FROM Table1 t1
LEFT OUTER JOIN Table2 t2 on t1.table2 = t2.id
WHERE t1.id = @id and t1.ShardKey = @shardkey
AND (t1.ShardKey = t2.ShardKey OR t2.ShardKey IS NULL)
Simplified definition for both tables:
CREATE TABLE [dbo].[Table1](
[Id] [int] NOT NULL,
[ShardKey] [int] NOT NULL,
[Table2] [int] NULL,
PRIMARY KEY CLUSTERED
(
[Id] ASC,
[ShardKey] ASC
))
CREATE TABLE [dbo].[Table2](
[Id] [int] NOT NULL,
[ShardKey] [int] NOT NULL,
PRIMARY KEY CLUSTERED
(
[Id] ASC,
[ShardKey] ASC
))
These queries seem to give exactly the same query plan in SQL Server Management Studio, but I wonder if there're situations where the one outperforms the other.
Best Answer
The LEFT OUTER JOIN ON version is the standard way to write this logic, so I would highly recommend using this version rather than fiddling around with
OR
for performance reasons.However, as ypercube has helpfully pointed out, the queries are not quite equivalent if the data and sharding methodology means that a
Table1.Id
value can end up in a differentShardKey
than theShardKey
of the correspondingTable1.table2
value. If that happens, then theOR t2.ShardKey IS NULL
portion of the 2nd query will be triggered and retain a row with matchingId
s even if theShardKey
does not match.In general, I suspect that they will typically result in similar query plans. Given two equivalent queries (which yours may or may not be, depending on the sharding methodology), by formulating the query in the standard way you are more likely to be working with the query optimizer rather than forcing it to recognize that the two queries can be expressed in similar ways and apply the appropriate simplifications.
If you want to be even more explicit about applying
@shardkey
to each table in the join, you can do that directly in theLEFT OUTER JOIN
. However, SQL Server will be able to easily recognize that your original query is equivalent to this one.An example of different query plans
On my SQL 2014 instance, the queries have (slightly) different query plans and the query plan for the LEFT OUTER JOIN ON version (top) is superior to the ORM-generated LEFT OUTER JOIN WHERE OR IS NULL version (bottom).
In particular, the seek into
Table2
should be a seek on bothid
andShardKey
, but in the LEFT OUTER JOIN WHERE OR IS NULL version it is a seek only onid
and a subsequentFilter
operator tests all matching rows in order to apply the appropriateShardKey
restriction.This difference may or may not be meaningful in your specific case (especially if
Id
is unique without adding in theShardKey
). However, if your real version actually uses table partitioning byShardKey
(and this just isn't represented in the simple sample query), you will likely be losing the benefit of partition elimination for the seek intoTable2
.