Sql-server – Why query optimizer doesn’t use negative filter index

optimizationquery-performancesql serversql-server-2008

I have a table that has following structure, as you can see, there are one clustered index and two non-clustered indexes, one non-clustered index is IX_ParentId_Include, and another one is FLIX_ParentId_Include

they are same apart from the second index is filtered.

CREATE TABLE [dbo].[PhotoRepo] (
    [PhotoRepoId]              INT            IDENTITY (1, 1) NOT FOR REPLICATION NOT NULL,
    [TypeId]                   TINYINT        NULL,
    [ParentId]                 INT            NULL,
    [RemoteLocation]           VARCHAR (4000) NOT NULL,
    [UploadedAt]               DATETIME       NOT NULL,
    [UploadedBy]               VARCHAR (255)  NULL,
    [FileSize]                 INT            NOT NULL,
    [DefaultChild]             BIT            NOT NULL,
    [RemoteLocationUploadedAt] DATETIME       NULL,
    CONSTRAINT [PK_FileList] PRIMARY KEY NONCLUSTERED ([PhotoRepoId] ASC)
);


GO
CREATE CLUSTERED INDEX [IX_PhotoRepoId]
    ON [dbo].[PhotoRepo]([PhotoRepoId] ASC);


GO


CREATE NONCLUSTERED INDEX [IX_ParentId_Include]
    ON [dbo].[PhotoRepo]([ParentId] ASC)
    INCLUDE([RemoteLocation], [DefaultChild], [TypeId]);
GO

CREATE NONCLUSTERED INDEX [FLIX_ParentId_Include]
    ON [dbo].[PhotoRepo]([ParentId] ASC)
    INCLUDE([RemoteLocation], [DefaultChild], [TypeId])
    WHERE TypeId <> 7;

GO

Now when I run this query with execution plan enabled, the query optimizer doesn't use filter index, which I expect it should be more efficient one, it uses IX_ParentId_Include. It uses filter index only when i either force it to, or delete IX_ParentId_Include

Anyone knows why?

SELECT 
    *
FROM   PhotoRepo
WHERE  
    typeid <> 7 and
    parentid = 123

PS: there is 30% rows that have typeid = 7

Best Answer

Why do you think it should use the filtered index? You've used SELECT * in your query and your predicate involves more than just the filter condition. SQL Server is going to have to look up those columns in another index that covers it anyway (your SELECT * plan should include a key lookup to retrieve those other columns).

SQL Server will decide based on stats/cardinality (e.g. how many rows currently match the filter, how many rows total) and decide at that point which index it should use. In this case, it is probably a coin flip, because it can seek to the 19 rows that match the ParentId predicate, and not cost any more in I/O to do so. It might choose the filtered index if it represented a much smaller portion of the table than 30%. You should see what it does, though, if you add OPTION (RECOMPILE) to the query. Depending on your parameterization settings, it may be using a plan that was cached with a different parameter value (e.g. one that doesn't match the index filter).

In this case, I believe you are seeing this behavior because, well, RTM. Please install Service Pack 3 and consider Cumulative Update 16. I browsed quickly and didn't see anything relating to filtered indexes, but this could be a by-product of some other optimizer wonkiness that has been fixed in the meantime. In any case, I can't really think of any good reason why you should continue running RTM at this time.

Could also be related to these are other "bugs known gaps in functionality":

I've also blogged about a whole slew of limitations with filtered indexes, and pointed to a ton of related Connect items. Most of these remain unfixed and, for the most part, filtered indexes have received absolutely no attention since they were first introduced. If you're going to use these objects, and expect them to be used to speed up your queries, you'd be wise to familiarize yourself with all of these limitations, and consider using an index hint to force them to be used in some scenarios. Caveats with using an index hint, of course:

  1. It may not work better (you don't always know better than the optimizer); it may work better for some queries but not all. Usually SQL Server will use the filtered index if (a) it can, and (b) it deems it to be the most efficient access method.
  2. Queries will fail if the index is dropped, or re-created with a new name.