Sql-server – How to get SQL to do the sensible thing without resorting to a query hint

indexperformancequery-performancesql serversql-server-2008

I have two tables, Items and Grouping. Items has about 7 million rows whereas Grouping only has 300,000.

Of the two conditions on my query, the 'human obvious' thing is to evaluate G.StatusID IN (1,2) first. It seems obvious to do this first because:

  • only 10% of rows fulfil the criteria about StatusID (I have checked, and SQL's statistics know this)
  • Grouping has relatively few rows
  • there is an index on StatusID

Those things combined mean that if the G.StatusID IN (1,2) condition is looked at, far fewer rows from Items have to be examined.

Select
    GroupingID,
    Sum (Amount) FullAmount
From
    Grouping G 
Join
    Items I
        -- With (INDEX(IX_Items_GroupingID))
        On G.GroupingID = I.GroupingID
Where
    I.ThingID Is Null
And
    G.StatusID IN (1,2)
Group By
    G.GroupingID

However SQL Server refuses to do this. It is reading the entirety of Items into memory, and suggests that I implement an index on ThingID. Implementing such an index does speed everything up, but I don't see why I should need to.

If I uncomment the index hint in the code above, SQL Server does as I want it to.

I'd prefer not to put the index hint into production. What is causing SQL Server to pick a slow plan?

(For background, around 25% of Items fulfil I.ThingID Is Null.)

Slow plan on Paste The Plan

Fast plan on Paste The Plan

Amount is a column from Items

Best Answer

It looks like you need to edit your index on Grouping_ID to include ThingID and Amount as per the comments, this will eliminate the Key Lookup in the second plan (76%).