I have the following query:
DECLARE @p__linq__0 UNIQUEIDENTIFIER
SET @p__linq__0 = '... some guid ...'
SELECT TOP 1
[EventId] AS [EventId],
[DateCreated] AS [DateCreated],
[LocationId] AS [LocationId],
[SourceName] AS [SourceName],
[SourceState] AS [SourceState],
[Priority] AS [Priority],
[EventDescription] AS [EventDescription],
[FirstTrigger] AS [FirstTrigger]
FROM [dbo].[Watchdog]
WHERE
[LocationId] = @p__linq__0
AND
[FirstTrigger] = 1
ORDER BY [DateCreated] DESC
Watchdog
table defines 2 indecies:
- Clustered index on
EventId
primary key column - Unclustered index on
DateCreated
column
This is actual execution plan for the query:
Reading other posts on how to eliminate key lookup I added another non-clustered index which includes all columns from SELECT
CREATE NONCLUSTERED INDEX [LocationId_FirstTrigger] ON [dbo].[Watchdog]
(
[LocationId] ASC,
[FirstTrigger] ASC
)
INCLUDE ( [EventId],
[DateCreated],
[SourceName],
[SourceState],
[Priority],
[EventDescription]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
GO
However, this didn't help and actual execution plan is the same. If I look at key lookup the output is actually included in newly added non clustered index.
My question is, why it's still doing key lookup
instead of index scan/seek ?
UPDATE
Following some suggestions in the comments, I dropped newly created non clustered index & instead recreated non clustered index on DateCreated
column including columns from SELECT
.
Now execution plan is the following:
Also query execution time dropped from 1+ minute to few seconds (this table has 18+ million rows).
Does this mean key lookup was done due to ORDER BY
on non-clustered index ?
Best Answer
The query specifies that results should be ordered by
DateCreated
. Since you already had a nonclustered index onDateCreated
, the optimizer decided that the cost of doing key lookups was lower than sorting all of the data byDateCreated
.Essentially, yes. It was estimated to be cheaper* to read the data in the required order, and get any additional fields through a key lookup, rather than reading all of the fields from a single index and then sorting it by
DateCreated
.You could confirm this by comparing the estimated costs between
The index hint would be like this on the
FROM
line:This should produce a plan with no key lookups (since
LocationId_FirstTrigger
is covering for that query), and aSort
operator. I'd expect the "Estimated Cost" to be higher, thus the other plan was chosen.* To explain the optimizer's choice here:
The
TOP (1)
in your query means the optimizer sets a row goal, meaning the plan is geared toward producing one row quickly. The optimizer expects to find one row from the Index Scan matching yourLocationId
predicate very quickly, since it assumes values are distributed uniformly. This may or may not be true in reality. The cost of one Key Lookup following the Index Scan is pretty small.The scan + lookup option therefore looks cheaper to the optimizer than finding matches using
LocationId_FirstTrigger
and sorting. You can turn the row goal logic off for the query as a test by adding anOPTION (QUERYTRACEON 4138)
hint. You will likely find the optimizer then chooses theLocationId_FirstTrigger
index without an index hint.Still, the best alternative is to modify your index as Mikael Eriksson suggests.