SQL Server Performance – Clustered Table Scan Due to SELECT *

indexnonclustered-indexperformancequery-performancesql server

I have a Records table with over 100 columns and very many rows, and a nonclustered index on 5 fields based on my access paths:

CREATE NONCLUSTERED INDEX [IX_Records_CustomerID]
ON [dbo].[Records] (
    [CustomerID] ASC, -- int
    [IsInvalid] ASC, -- int
    [IsProcessed] ASC, -- bit
    [IsRejected] ASC, -- bit
    [RecordName] ASC, -- varchar(12)
;

The 5 fields do not include the primary key RecordID, which is the column in the clustered index.

Here is my poorly performing query:

SELECT * FROM Records WHERE CustomerID IN (181, 283, 505)

The execution plan shows that it performs a Clustered Index Scan, which I understand is because I'm selecting columns that are not included in the index. In Management Studio, I change the query to:

SELECT CustomerID, IsInvalid, IsProcessed, IsRejected, RecordName FROM Records 
    WHERE CustomerID IN (181, 283, 505)

And the execution plan shows an Index Seek, and the query execution time drops from 44 seconds to 2 seconds. However, I lack the liberty in the application to replace the * with only the columns I need and have included in my index.

Is there any way around the clustered index scan when I'm locked into SELECT *?

Best Answer

If you need columns in the output that aren't covered by the index, the optimizer has to make a choice:

  1. Perform a table / clustered index scan (therefore all columns are there)
  2. Perform a seek, then perform lookups to retrieve the columns not covered

Which way it will choose depends on a variety of things, including how narrow the index is, how many rows match the predicate, etc. You can force a seek with the FORCESEEK hint, but I suspect it will end up performing the same or worse than the scan SQL Server has chosen in your case.

Some options:

  1. Change the app to run a proper query. I listed this first for a reason.
  2. Create a view that selects only the columns you need:

    CREATE VIEW dbo.myview
    WITH SCHEMABINDING
    AS
      SELECT col1, col2, col3 FROM dbo.tablename;
    

    Then you can change the app to SELECT * from this view. Or you can get even more creative and rename the original table, and change the name of this view to what the name of the table used to be. Breaking change, obviously; proceed with caution.

  3. Add all of the other columns to the key or INCLUDE list for the index. If these are hard-coded values and always the ones used, you may consider a filtered index.