SQL Server – Covering Index Changes Execution Plan but Is Not Used

Tags: execution-plan, index, performance, query-performance, sql-server, sql-server-2016

I have the following occasionally slow running query:

SELECT C.CustomerID 
FROM dbo.Customers C WITH (NOLOCK)
WHERE C.Forename = @Forename
    AND C.Surname = @Surname
OPTION (RECOMPILE)

CustomerID is the Primary Key on the Customers table. The Customers table also has the following two non-clustered indexes:

CREATE NONCLUSTERED INDEX idx_Forename ON Customers (Forename ASC)
CREATE NONCLUSTERED INDEX idx_Surname ON Customers (Surname ASC)

When I run the query with both a surname and forename entered the query optimiser uses the index 'idx_Surname' as in the following execution plan:

Execution Plan without covering index

This query takes over two minutes to complete for this particular search and finds no results. For the values entered, @Forename has no matches in the Customers table, while @Surname matches 31,162 records. When I search by @Surname only, the 31,162 records are returned in under a second with the following plan:

Execution Plan surname only search

In an attempt to optimise the query for searches containing both Forename and Surname I added the following covering index:

CREATE NONCLUSTERED INDEX idx_Surname_Covering ON dbo.Customers (Surname) INCLUDE (Forename)

The query with both Forename and Surname then returns in less than one second. However, the covering index is not used in the actual execution plan:

Execution Plan with covering index

So,

  1. Is the covering index required or is there a better way to improve the performance and
  2. Why does the additional covering index cause the change of index in the actual execution plan from idx_Forename to idx_Surname?

P.S. The query above is an isolated example; in practice, either surname or forename or both may be searched for, and the Customers table also includes other searchable columns with their own indexes. That detail did not seem relevant to the question, so I have not included it.

Best Answer

1) Is the covering index required or is there a better way to improve the performance

Best index

The best index would be the most covering, selective index for the queries accessing the table.

Take, for example, a table where 50,000 rows have the first name 'John' but only one row has the last name 'McClane'. Should you create the index with the first name or the last name as the leading key column?

Answer:

It depends... If you are always searching for John McClane, then it's an open-and-shut case: index the last name first. But what if there are also queries searching for Constanthin Smith? You could have over 5,000 Smiths, but only five Constanthins.

In short, it depends on your queries: which columns they seek on, how often each query is executed, and so on.

If your queries are always going to seek on both the first name and the last name, then it is simply a case of picking the more selective column as the first key column. Keep in mind that the read-performance gain should outweigh the write-performance cost of maintaining the index.
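One rough way to judge which column is more selective is to compare distinct value counts against the total row count — a quick sketch against the Customers table from the question (the higher the distinct count, the more selective the column):

```sql
-- Compare selectivity: the column with more distinct values relative
-- to total rows is usually the better leading key column.
SELECT
    COUNT(*)                 AS TotalRows,
    COUNT(DISTINCT Forename) AS DistinctForenames,
    COUNT(DISTINCT Surname)  AS DistinctSurnames
FROM dbo.Customers;
```

This is only an aggregate view; skew within individual values (the John/McClane case above) still matters.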

Of course, nothing stops you from creating two indexes, one on (firstname, lastname) and one on (lastname, firstname).

(Your update / insert / delete statements might.)
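A sketch of that two-index approach for the columns in the question (index names are illustrative; each extra index adds write overhead on every insert, update, and delete):

```sql
-- One index per seek order, so either column can serve as the
-- leading (seekable) key depending on which predicates are supplied.
CREATE NONCLUSTERED INDEX idx_Forename_Surname ON dbo.Customers (Forename, Surname);
CREATE NONCLUSTERED INDEX idx_Surname_Forename ON dbo.Customers (Surname, Forename);
```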

Not considering filtered indexes and whatnot, the best index for your example would be:

CREATE NONCLUSTERED INDEX idx_Forename_Surname ON dbo.Customers (Forename,Surname)

2) Why does the additional covering index cause the change of index in the actual execution plan from idx_Forename to idx_Surname?

I don't think this happened just because of the new index itself, but because of the statistics created as part of the index build.

Even though these statistics cover the same column as the ones on idx_Surname, my guess is that they have a higher sample rate (100%), because statistics created by an index build use a full scan.

If an automatic statistics update had occurred on the statistics belonging to idx_Surname, they might have ended up with a much smaller sample rate (e.g. 1%), resulting in bad estimates.
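You can check the actual sample rate of every statistic on the table with the `sys.dm_db_stats_properties` DMF — a sketch (column aliases are illustrative):

```sql
-- Show the sample rate (rows_sampled vs rows) and last update time
-- for every statistic on dbo.Customers.
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       CAST(100.0 * sp.rows_sampled / NULLIF(sp.rows, 0) AS decimal(5, 2)) AS sample_pct
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Customers');
```

A statistic showing a low sample_pct and a recent last_updated would support the auto-update theory.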

You could test this theory by removing the idx_Surname_Covering index and its statistics, then updating the statistics on dbo.Customers with a 100% sample rate (full scan):

UPDATE STATISTICS dbo.Customers WITH FULLSCAN

This will hopefully change your plan to use the better seek.

If this is why your query plan changed, and updating the statistics with FULLSCAN during maintenance windows is not a viable option, you could change the sample rate used for subsequent updates.
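On SQL Server 2016 SP1 CU4 and later, the sample rate can be persisted so that subsequent automatic statistics updates reuse it instead of falling back to the default sample — a sketch:

```sql
-- Persist the FULLSCAN sample rate so later automatic statistics
-- updates keep using it (SQL Server 2016 SP1 CU4 or later).
UPDATE STATISTICS dbo.Customers
WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON;
```

This avoids the estimate regressions that a low-sample auto update could otherwise reintroduce, at the cost of more expensive statistics updates.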