Sql-server – Query using a filtered index but wrong estimated number of rows

filtered-indexindex-tuningsql-server-2016statistics

I have a table with a nonclustered index on a datetime2 field.

On that same table i have a field (char[1]) which is used to logically delete records and can have 2 distinct values : A(active) or D(deleted).

There are 451047 records with the datetime2 field set at NULL but only 7095 are marked as A.

Every query in the application looks for active records only, so, every query that's looking for NULLs in the datetime field was getting really bad estimates and, with that, bad execution plans.

Simple example: Plan without filtered index: look at estimated rows vs actual rows

I decided then to create a filtered non clustered index but estimates were still incorrect: Plan using a filtered index, still getting wrong estimates

It looks like i still get the old estimates, even if the query is using properly the filtered index. Does anyone know the reason of this behaviour?

These are the new statistics of the filtered index: Filtered index statistics

Table definition:

CREATE TABLE [TYDATPRD].[HAND00F](
    [STDRECSTS] [char](1) NULL,
    [HDHAND] [numeric](14, 0) IDENTITY(1,1) NOT NULL,
    [HDCHKINDT] [datetime2](7) NULL,
    --lots of other columns which I don't think are needed

    CONSTRAINT [PK_TYDATPRD_HAND00F] PRIMARY KEY CLUSTERED (
        [HDHAND] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [DATA]
) ON [DATA]

Nonclustered filtered index definition:

CREATE NONCLUSTERED INDEX [IX_HAND00F_HDCHKINDT] ON [TYDATPRD].[HAND00F] (
    [HDCHKINDT] ASC
)
WHERE [STDRECSTS]='A'
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [INDEXES]
GO

Best Answer

This is not intuitive but try adding the char[1] column to the filtered index definition:

CREATE NONCLUSTERED INDEX [IX_HAND00F_HDCHKINDT] 
ON [TYDATPRD].[HAND00F] 
(
    [HDCHKINDT], STDRECSTS
)
WHERE [STDRECSTS] = 'A'
WITH (DROP_EXISTING = ON) ON [INDEXES];

In my testing, this led to the right estimates.

As an aside, I noticed that the plans in your question were different - one was properly parameterized and the other had a constant. Be very careful about how you're doing local testing with local variables and typed constants, especially if in the real world this will be done by parameters to stored procedures.

Also, be very careful about case - literals in the where clause like = 'a' and = 'A' will generate different plans if they don't get parameterized, because the query text is different. Never mind that in some collations they won't yield the same results. (And does that char(1) column need to allow NULL?)