Sql-server – Eliminate key lookup in execution plan

Tags: execution-plan, performance, query-performance, sql-server, sql-server-2012

I have the following query:

DECLARE @p__linq__0 UNIQUEIDENTIFIER

SET @p__linq__0 = '... some guid ...'

SELECT TOP 1
    [EventId] AS [EventId],       
    [DateCreated] AS [DateCreated],       
    [LocationId] AS [LocationId],       
    [SourceName] AS [SourceName],       
    [SourceState] AS [SourceState],       
    [Priority] AS [Priority],       
    [EventDescription] AS [EventDescription],       
    [FirstTrigger] AS [FirstTrigger]
FROM [dbo].[Watchdog]
WHERE 
    [LocationId] = @p__linq__0
    AND 
    [FirstTrigger] = 1
ORDER BY [DateCreated] DESC

The Watchdog table defines two indexes:

A clustered index on the EventId primary key column
A nonclustered index on the DateCreated column

This is the actual execution plan for the query:

After reading other posts on how to eliminate key lookups, I added another nonclustered index that covers all columns in the SELECT list:

CREATE NONCLUSTERED INDEX [LocationId_FirstTrigger] ON [dbo].[Watchdog]
(
    [LocationId] ASC,
    [FirstTrigger] ASC
)
INCLUDE (   [EventId],
    [DateCreated],
    [SourceName],
    [SourceState],
    [Priority],
    [EventDescription]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
GO

However, this didn't help, and the actual execution plan is the same. If I look at the Key Lookup operator, its output columns are all included in the newly added nonclustered index.

My question is: why is it still doing a key lookup instead of an index scan/seek?

UPDATE

Following some suggestions in the comments, I dropped the newly created nonclustered index and instead recreated the nonclustered index on the DateCreated column, including the columns from the SELECT list.
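For reference, the recreated index would look roughly like the sketch below. The index name IX_Watchdog_DateCreated is an assumption, since the post does not give the name of the original DateCreated index. EventId is left out of INCLUDE because the clustered key is carried in every nonclustered index automatically.

```sql
-- Hypothetical index name: the original DateCreated index name is not given in the post.
-- An ascending key is sufficient: SQL Server can scan the index backward
-- to satisfy ORDER BY [DateCreated] DESC.
CREATE NONCLUSTERED INDEX [IX_Watchdog_DateCreated]
ON [dbo].[Watchdog] ([DateCreated] ASC)
INCLUDE (
    [LocationId],
    [FirstTrigger],
    [SourceName],
    [SourceState],
    [Priority],
    [EventDescription]
);
GO
```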

Now the execution plan is the following:

The query execution time also dropped from over a minute to a few seconds (this table has 18+ million rows).

Does this mean the key lookup was done because of the ORDER BY on the nonclustered index?

Best Answer

My question is: why is it still doing a key lookup instead of an index scan/seek?

The query specifies that results should be ordered by DateCreated. Since you already had a nonclustered index on DateCreated, the optimizer decided that the cost of doing key lookups was lower than sorting all of the data by DateCreated.

Does this mean the key lookup was done because of the ORDER BY on the nonclustered index?

Essentially, yes. It was estimated to be cheaper* to read the data in the required order, and get any additional fields through a key lookup, rather than reading all of the fields from a single index and then sorting it by DateCreated.

You could confirm this by comparing the estimated costs between

the original query (with the original indexes), and
the original query with an index hint

The index hint would be like this on the FROM line:

FROM [dbo].[Watchdog] WITH (INDEX (LocationId_FirstTrigger))

This should produce a plan with no key lookups (since LocationId_FirstTrigger is covering for that query), and a Sort operator. I'd expect the "Estimated Cost" to be higher, thus the other plan was chosen.
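As a sketch of that comparison, the two statements below can be run in one batch with the estimated or actual plan enabled, so the costs appear side by side. The GUID is a placeholder, not a value from the question; substitute a LocationId that exists in your table.

```sql
-- Placeholder GUID (assumption): replace with a real LocationId.
DECLARE @p__linq__0 UNIQUEIDENTIFIER = '00000000-0000-0000-0000-000000000000';

-- Plan A: the optimizer's own choice.
SELECT TOP (1)
    [EventId], [DateCreated], [LocationId], [SourceName],
    [SourceState], [Priority], [EventDescription], [FirstTrigger]
FROM [dbo].[Watchdog]
WHERE [LocationId] = @p__linq__0
    AND [FirstTrigger] = 1
ORDER BY [DateCreated] DESC;

-- Plan B: forced onto the covering index, so no key lookup is needed
-- but a Sort is required to honor the ORDER BY.
SELECT TOP (1)
    [EventId], [DateCreated], [LocationId], [SourceName],
    [SourceState], [Priority], [EventDescription], [FirstTrigger]
FROM [dbo].[Watchdog] WITH (INDEX (LocationId_FirstTrigger))
WHERE [LocationId] = @p__linq__0
    AND [FirstTrigger] = 1
ORDER BY [DateCreated] DESC;
```

Keep in mind that estimated costs reflect the optimizer's assumptions, not measured run time, so the plan with the higher estimate may still run faster in practice.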

* To explain the optimizer's choice here:

The TOP (1) in your query means the optimizer sets a row goal, meaning the plan is geared toward producing one row quickly. The optimizer expects to find one row from the Index Scan matching your LocationId predicate very quickly, since it assumes values are distributed uniformly. This may or may not be true in reality. The cost of one Key Lookup following the Index Scan is pretty small.

The scan + lookup option therefore looks cheaper to the optimizer than finding matches using LocationId_FirstTrigger and sorting. You can turn the row goal logic off for the query as a test by adding an OPTION (QUERYTRACEON 4138) hint. You will likely find the optimizer then chooses the LocationId_FirstTrigger index without an index hint.
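A minimal sketch of that test follows. The GUID is again a placeholder rather than a value from the question. Note that QUERYTRACEON normally requires sysadmin membership, so this is meant for a test environment.

```sql
-- Placeholder GUID (assumption): replace with a real LocationId.
DECLARE @p__linq__0 UNIQUEIDENTIFIER = '00000000-0000-0000-0000-000000000000';

SELECT TOP (1)
    [EventId], [DateCreated], [LocationId], [SourceName],
    [SourceState], [Priority], [EventDescription], [FirstTrigger]
FROM [dbo].[Watchdog]
WHERE [LocationId] = @p__linq__0
    AND [FirstTrigger] = 1
ORDER BY [DateCreated] DESC
-- Trace flag 4138 disables row goal logic for this statement only.
OPTION (QUERYTRACEON 4138);
```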

Still, the best alternative is to modify your index as Mikael Eriksson suggests.
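The suggestion referred to is not reproduced here, so the following is only an assumption about one index shape that fits this query, not necessarily what was proposed. Placing the two equality columns first and DateCreated last in the key lets the optimizer seek to the matching rows and read them already ordered by date, so neither a Sort nor a Key Lookup should be needed. The index name is hypothetical.

```sql
-- Hypothetical index: equality predicates first, ORDER BY column last in the key.
CREATE NONCLUSTERED INDEX [IX_Watchdog_LocationId_FirstTrigger_DateCreated]
ON [dbo].[Watchdog]
(
    [LocationId] ASC,
    [FirstTrigger] ASC,
    [DateCreated] ASC   -- read backward to satisfy ORDER BY [DateCreated] DESC
)
INCLUDE (
    [SourceName],
    [SourceState],
    [Priority],
    [EventDescription]
);
GO
```

With this index in place, the separate LocationId_FirstTrigger index would be redundant for this query.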

Related Solutions

Sql-server – Key Lookup and Full-text index

Key lookup operations can be avoided by using a covering index. A full-text index cannot be "included" in a covering index. However, including the Title column in the covering index is still useful, since SQL Server can then find everything it needs for the query, aside from the full-text query results, by seeking the covering index.

I've created a simple test-bed to show this in action.

First, we'll create an empty database for our test, since we cannot create full-text catalogs in tempdb:

USE master;
IF EXISTS (SELECT 1 FROM sys.databases d WHERE d.name = N'FullTextTest') 
BEGIN
    ALTER DATABASE FullTextTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE FullTextTest;
END
CREATE DATABASE FullTextTest;
ALTER DATABASE FullTextTest SET RECOVERY SIMPLE;
BACKUP DATABASE FullTextTest TO DISK = 'NUL:';
GO

Here, we'll create a mock-up of your table. You say in the question that all mentioned columns have an associated non-clustered index, so we'll define those too:

USE FullTextTest;
GO

CREATE FULLTEXT CATALOG ftc
WITH ACCENT_SENSITIVITY = OFF
AS DEFAULT
AUTHORIZATION dbo;

CREATE TABLE dbo.TitleTable
(
    PK_ID int NOT NULL
        CONSTRAINT PK_TitleTable
        PRIMARY KEY CLUSTERED
        IDENTITY(1,1)
    , FKID1 int NOT NULL
    , FKID2 int NOT NULL
    , Title nvarchar(100) NOT NULL
);

CREATE NONCLUSTERED INDEX IX_TitleTable_FKID1
ON dbo.TitleTable (FKID1);

CREATE NONCLUSTERED INDEX IX_TitleTable_FKID2
ON dbo.TitleTable (FKID2);

CREATE NONCLUSTERED INDEX IX_TitleTable_Title
ON dbo.TitleTable (Title);

Here's the full-text index:

CREATE FULLTEXT INDEX ON dbo.TitleTable(Title) 
KEY INDEX PK_TitleTable 
ON ftc 
WITH (CHANGE_TRACKING = AUTO, STOPLIST SYSTEM);

I've got a database with around 47,000 words in it which I'll use to fill the dbo.TitleTable table:

INSERT INTO dbo.TitleTable (FKID1, FKID2, Title)
SELECT wl.WordRow, wl.WordRow, wl.Word
FROM WordsDB.dbo.WordList wl;
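Because the full-text index uses CHANGE_TRACKING = AUTO, the newly inserted rows are indexed asynchronously, so a CONTAINS query run immediately after the insert may miss them. The sketch below, which is an addition to the original test-bed, waits until the catalog reports an idle population status before continuing.

```sql
-- PopulateStatus = 0 means the catalog is idle (no population in progress).
WHILE FULLTEXTCATALOGPROPERTY(N'ftc', 'PopulateStatus') <> 0
BEGIN
    WAITFOR DELAY '00:00:01';  -- poll once per second
END

-- Optional check at table level: 0 means the table's full-text index is idle.
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.TitleTable'), 'TableFulltextPopulateStatus')
    AS TablePopulateStatus;
```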

Here's the query from your question:

DECLARE @ID_PARAM1 int;
DECLARE @ID_PARAM2 int;
DECLARE @SINGLE_WORD_PARAM nvarchar(100);
SET @ID_PARAM1 = 46777;
SET @ID_PARAM2 = 46777;
SET @SINGLE_WORD_PARAM = N'"' + 'it' + N'"';
SELECT T2.Title 
FROM TitleTable T1
    INNER JOIN TitleTable T2 ON T2.FKID1 = T1.FKID1
WHERE T1.FKID2 = @ID_PARAM1 
    AND T2.FKID2 = @ID_PARAM2 
    AND CONTAINS(T1.Title, @SINGLE_WORD_PARAM);

At this point, if we run the query, we see the following plan:

As expected, there is an index seek on the IX_TitleTable_FKID2 non-clustered index, with an associated key-lookup against the table itself for the Title column.

If we add a compound index on both FKID2 and FKID1, we'd expect a different plan, which is what we get:

CREATE INDEX IX_TitleTable_FTS_Cover_NoInclude
ON dbo.TitleTable (FKID2, FKID1);

However, the key lookup for the Title column is still there. What if we add an INCLUDE clause to our index above?

CREATE INDEX IX_TitleTable_FTS_Cover
ON dbo.TitleTable (FKID2, FKID1)
INCLUDE (Title);

Success! No key-lookup operation required. The cost of the query has also dropped from 0.016 to 0.013, so that's a win.

Sql-server – Key lookup partition pruning

Only rows in 7 partitions qualify the predicate(s) so there is no need to ever lookup rows in the other 3 partitions.

The query plan may choose to scan on B and probe (lookup) on A.

A Key Lookup is always a seek into the clustered index. There is no partition elimination in your original example by the way - the query processor is just reporting that it touched 7 distinct partitions (3..9) while executing the query. None of the lookups accessed other partitions.
