Sql-server – Sql Server fails to use index on simple bijection

execution-plansql serversql-server-2016

This is another query optimizer conundrum.

Maybe I'm just over-estimating query optimizers, or maybe I'm missing something – so I'm putting it out there.

I have a simple table

CREATE TABLE [dbo].[MyEntities](
  [Id] [uniqueidentifier] NOT NULL,
  [Number] [int] NOT NULL,
  CONSTRAINT [PK_dbo.MyEntities] PRIMARY KEY CLUSTERED ([Id])
)

CREATE NONCLUSTERED INDEX [IX_Number] ON [dbo].[MyEntities] ([Number])

with an index and some thousand rows in there, Number being evenly distributed in the values 0, 1 and 2.

Now this query:

SELECT * FROM
    (SELECT
        [Extent1].[Number] AS [Number],
        CASE
        WHEN (0 = [Extent1].[Number]) THEN 'one'
        WHEN (1 = [Extent1].[Number]) THEN 'two'
        WHEN (2 = [Extent1].[Number]) THEN 'three'
        ELSE '?'
        END AS [Name]
        FROM [dbo].[MyEntities] AS [Extent1]
        ) P
WHERE P.Number = 0;

does an index seek on IX_Number as one would expect.

If the where clause is

WHERE P.Name = 'one';

however, it becomes a scan.

The case-clause is obviously a bijection, so in theory an optimization should be possible to deduct the first query plan from the second query.

It's also not purely academic: The query is inspired by translating enum values to their respective friendly names.

I'd like to hear from someone who know what can be expected from query optimizers (and specifically the one in Sql Server): Am I simply expecting too much?

I'm asking as I had cases before where some slight variation of a query would make an optimization suddenly come to light.

I'm using Sql Server 2016 Developer Edition.

Best Answer

Am I simply expecting too much?

Yes. At least in current versions of the product.

SQL Server will not pick apart the CASE statement and reverse engineer it to discover that if the result of the computed column is 'one' then [Extent1].[Number] must be 0.

You need to make sure that you write your predicates to be sargable. Which almost always involves it being in the form. basetable_column_name comparison_operator expression.

Even minor deviations break sargability.

WHERE P.Number + 0 = 0;

would not use an index seek either even though it is even more straightforward to simplify than the CASE expression.

If you want to search on a string name and get a seek on number you would need a mapping table with the names and numbers and join onto it in the query, then the plan might have a seek on the mapping table followed by a correlated seek on [dbo].[MyEntities] with the number returned from the first seek.