The query
I had a specific query I wanted to optimize:
SELECT * /*12 columns*/
FROM [dbo].[EnterpriseGroup]
WHERE
(EnterpriseGroup.ChildId = 123 OR EnterpriseGroup.FatherId = 234)
AND StatusCd >= 2
There is also already an index on FatherId
, but not on the ChildId
. The primary key is among the 12 selected columns, but none of those used in the WHERE clause here.
The use
This is a simple query but it's run very, very often during daily work. The table is also small, around 8000 rows.
The query is used to find groups of enterprises. There are about 2 Millions of enterprise entries, so only less than 0.5% have a matching group row and thus most of the time, no group will be found.
The recommendation
When using SSMS and inspecting the "Actual Execution Plan" it gives this plan:
The predicate shown is actually the WHERE clause.
And, it recommends to create an index, which is basically indexing on the WHERE clause and adding all the queried columns directly into the index. Seems not very clever to me but maybe this is what this questions is all about:
/*
Missing Index Details from SQLQuery6.sql .....
The Query Processor estimates that implementing the following index could improve the query cost by 68.9052%.
*/
/*
USE [...]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[EnterpriseGroup] ([StatusCd])
INCLUDE (....all the 12 queried columns......)
GO
*/
The result
After I create the recommended index I get the following plan:
It's not used at all! (And the Table Scan hover info is exactly the same, still having a copy of the WHERE Clause as predicate)
The Question
Why does SQL Server not use an existing index that the SSMS adviced to create?
If not, then what is the significance of a missing index recommendation in the execution plan of SSMS (SQL Server Management Studio) against a Microsoft SQL server?
The notes
Note: I am no DBA, but a software developer. I have read a bit into this, including: https://www.brentozar.com/archive/2013/07/dude-who-stole-my-missing-index-recommendation/ but I did not clarify to me.
Note: In case it matters:
– SQL Server Version 11.0.7493.4, running on Windows NT 6.3.
– Microsoft SQL Server Management Studio is Version 11.0.7493.4
Best Answer
Assuming you
WHERE
condition still utilizesChildId
,FatherId
andStatusCd
. It is possible that from a statistics perspective, that eitherChildId
orFatherId
is more selective. TakingChildId
andFatherId
out of theWHERE
clause should result in that new index being used, sinceStatusCd
is theindexed column
in theindex
.If you hover over the Table Scan section you should see something like the below image:
It is possible that even though it recommended creating that index, that it is still querying by either
ChildId
orFatherId
.It would do this if
ChildId
orFatherId
is more selective. Lets sayStatusCd >= 2
returns 6,000 of the 8,000 rows. But eitherChildId = 123
orFatherId = 234
only has 1 row. Then doing a table scan on that one column, and applying the rest of the conditions after the fact is a more efficient query plan (theoretically speaking) than returning all 6,000 rows fromStatusCd >= 2
and trying to apply theChildId
orFatherId
conditions after.This was something I learned from a question I asked a little bit ago. The guy who answered it had a really great way of explaining what I tried to say here. Does the Query Optimizer Prefer to Query on Constants before Columns?
Hopefully that helps.