I have two tables, Items
and Grouping
. Items
has about 7 million rows whereas Grouping
only has 300,000.
Of the two conditions on my query, the 'human obvious' thing is to evaluate G.StatusID IN (1,2)
first. It seems obvious to do this first because:
- only 10% of rows fulfil the criteria about
StatusID
(I have checked, and SQL's statistics know this) Grouping
has relatively few rows- there is an index on
StatusID
Those things combined mean that if the G.StatusID IN (1,2)
condition is looked at, far fewer rows from Items
have to be examined.
Select
GroupingID,
Sum (Amount) FullAmount
From
Grouping G
Join
Items I
-- With (INDEX(IX_Items_GroupingID))
On G.GroupingID = I.GroupingID
Where
I.ThingID Is Null
And
G.StatusID IN (1,2)
Group By
G.GroupingID
However SQL Server refuses to do this. It is reading the entirety of Items
into memory, and suggests that I implement an index on ThingID. Implementing such an index does speed everything up, but I don't see why I should need to.
If I uncomment the index hint in the code above, SQL Server does as I want it to.
I'd prefer not to put the index hint into production. What is causing SQL Server to pick a slow plan?
(For background, around 25% of Items
fulfil I.ThingID Is Null
.)
Amount
is a column from Items
Best Answer
It looks like you need to edit your index on
Grouping_ID
to includeThingID
andAmount
as per the comments, this will eliminate the Key Lookup in the second plan (76%).