SQL Server Performance – Why Simple SELECTs Are Faster Than LEFT JOINs

execution-planperformancequery-performancesql serversql-server-2012

I need to find a single value from a table A containing three foreign keys to two other tables B and C.

For the sake of experiment, I tested two ways to query the value:

Multiple queries:

declare @start int = (select top 1 [Id] from [B] where [Day] = '2015-01-01')
declare @end int = (select top 1 [Id] from [B] where [Day] = '2017-06-14')
declare @category int = (select top 1 [Id] from [C] where [Title] = 'Hello, World!')

select top 1 [Name]
from [A]
where [StartId] = @start
    and [EndId] = @end
    and [CategoryId] = @category
    and [Day] = '2016-05-27'

Single query:

select top 1 [Name]
from [A]
left join [B] as [BStart] on [BStart].[Id] = [A].[StartId]
left join [B] as [BEnd] on [BEnd].[Id] = [A].[EndId]
left join [C] on [C].[Id] = [A].[CategoryId]
where [BStart].[Day] = '2015-01-01'
    and [BEnd].[Day] = '2017-06-14'
    and [C].[Title] = 'Hello, World!'
    and [A].[Day] = '2016-05-27'

I was surprised that the execution plan indicates that the single query is more expensive than multiple queries. When doing all five selects together, the one with left joins indicates 53%. The other four queries indicate 12% each.

Those are the execution plans:

declare @start int = (select top 1 [Id] from [B] where [Day] = '2015-01-01')`

enter image description here

declare @end int = (select top 1 [Id] from [B] where [Day] = '2017-06-14')

(Same as below)

declare @category int = (select top 1 [Id] from [C] where [Title] = 'Hello, World!')

enter image description here

select top 1 [Name]
from [A]
where [StartId] = @start
    and [EndId] = @end
    and [CategoryId] = @category
    and [Day] = '2016-05-27'

enter image description here

select top 1 [Name]
from [A]
left join [B] as [BStart] on [BStart].[Id] = [A].[StartId]
left join [B] as [BEnd] on [BEnd].[Id] = [A].[EndId]
left join [C] on [C].[Id] = [A].[CategoryId]
where [BStart].[Day] = '2015-01-01'
    and [BEnd].[Day] = '2017-06-14'
    and [C].[Title] = 'Hello, World!'
    and [A].[Day] = '2016-05-27'

enter image description here

Why is the single query with left joins slower than the first approach?

Best Answer

Your execution plan for the "individual queries" shows that pre-calculating the StartID, Category etc allows an index to be used efficiently on A, "seeking" straight to the record(s) you want (you have a Non-Clustered Index Seek in your query plan), having identified the given Category etc to search for.

The "single-query" with JOINs, on the other hand, needs to carry out the join between table A-B-C "on the fly" before returning the "top 1" result matching your criteria. This join will involve all the records of those tables where they match to the other related table (full clustered index scan).

If you are frequently querying by Day and Title like this, it looks like an index is missing (or statistics out of date) that allow table B and C to be searched by those criteria.

By the way, Top 1 will only give a deterministic result if you "order by" some criteria (or if your search statement is unique) - otherwise you will get the 1st row the query comes to, which may or may not be consistent between occasions when you run it.