I have a Records
table with over 100 columns and very many rows, and a nonclustered index on 5 fields based on my access paths:
CREATE NONCLUSTERED INDEX [IX_Records_CustomerID]
ON [dbo].[Records] (
[CustomerID] ASC, -- int
[IsInvalid] ASC, -- int
[IsProcessed] ASC, -- bit
[IsRejected] ASC, -- bit
[RecordName] ASC, -- varchar(12)
;
The 5 fields do not include the primary key RecordID
, which is the column in the clustered index.
Here is my poorly performing query:
SELECT * FROM Records WHERE CustomerID IN (181, 283, 505)
The execution plan shows that it performs a Clustered Index Scan, which I understand is because I'm selecting columns that are not included in the index. In Management Studio, I change the query to:
SELECT CustomerID, IsInvalid, IsProcessed, IsRejected, RecordName FROM Records
WHERE CustomerID IN (181, 283, 505)
And the execution plan shows an Index Seek, and the query execution time drops from 44 seconds to 2 seconds. However, I lack the liberty in the application to replace the *
with only the columns I need and have included in my index.
Is there any way around the clustered index scan when I'm locked into SELECT *
?
Best Answer
If you need columns in the output that aren't covered by the index, the optimizer has to make a choice:
Which way it will choose depends on a variety of things, including how narrow the index is, how many rows match the predicate, etc. You can force a seek with the
FORCESEEK
hint, but I suspect it will end up performing the same or worse than the scan SQL Server has chosen in your case.Some options:
Create a view that selects only the columns you need:
Then you can change the app to
SELECT *
from this view. Or you can get even more creative and rename the original table, and change the name of this view to what the name of the table used to be. Breaking change, obviously; proceed with caution.Add all of the other columns to the key or
INCLUDE
list for the index. If these are hard-coded values and always the ones used, you may consider a filtered index.