Sql-server – Covering Index Question

indexindex-tuningperformancequery-performancesql servert-sql

I have the following query

Select  Pt.PRODID, PT.INVENTREFID,  Inv.ItemName, PT.ItemID, Configid,  Pt.QTYSCHED,
PT.DLVDATE,  Pt.CREATEDDATETIME, pt.SCHEDEND,
 CASE Left(PT.INVENTREFID, 3)
     WHEN 'SJB' THEN ST.SALESNAME
     WHEN 'WJB' THEN 'Sub - Assembly'
     ELSE 'Stock'
  END as CustomerName
from Dynamicsv5Realtime.dbo.PRODTABLE PT
Join Dynamicsv5Realtime.dbo.Inventdim ID
On PT.InventdimId = ID.InventdimID
and PT.Dataareaid = ID.dataareaid
Join Dynamicsv5Realtime.dbo.INVENTTABLE Inv
On Inv.itemid = PT.ItemId
Left Join Dynamicsv5Realtime.dbo.SALESTABLE ST
ON ST.SalesId = PT.INVENTREFID
and ST.Dataareaid = PT.dataareaid
where pt.PRODSTATUS in(2,3,4)
and PT.DATAAREAID = 'AJB'
Order by 7 

when I was looking to improve it I came up with the following index

CREATE NONCLUSTERED INDEX [INDEX_2]
ON [dbo].[PRODTABLE] ([INVENTREFID],[DATAAREAID],[INVENTDIMID],[PRODSTATUS])
INCLUDE ([PRODID],[ItemID],[QTYSCHED],[DLVDATE],[CREATEDDATETIME],[SCHEDEND])

However this was not as efficient as the following which the missing index DMV came up with

CREATE NONCLUSTERED INDEX [INDEX_1]
ON [dbo].[PRODTABLE] ([DATAAREAID],[PRODSTATUS])
INCLUDE ([ITEMID],[QTYSCHED],[DLVDATE],[SCHEDEND],[INVENTREFID],[PRODID],[INVENTDIMID],    [CREATEDDATETIME]) 

What I am wondering is when is it more efficient to put columns in the index itself rather than in the Include part of the index?

May thanks

Best Answer

Very generally and non-technically, you should consider using the column as part of the index key if you are going to use that column in the WHERE clause (or otherwise filter/seek), and INCLUDE if it is merely being used to avoid a lookup (I like to call this "coming along for the ride"). Of course there are always exceptions:

  • you can't always include columns in the key due to size;
  • it may not matter due to cardinality; or,
  • it might not affect the plan due to where you would have to put it in the key to allow the index to continue satisfying other query semantics.

There are also cases where an INCLUDEd column can satisfy a filter just fine, and plenty of cases where changing the index might help this specific query but it might wreak havoc on the rest of your workload. You should always test index changes against an entire workload representing as much of a full business cycle as possible, rather than rely on the missing indexes DMV or your own analysis which, it seems, is focused on a single query.

I'm sure Paul White will come along with a much more elaborate and technically accurate response.