Sql-server – Why is an aggregate operator used after a unique index scan

filtered-index, optimization, sql-server

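I have a table with a unique filtered index that excludes NULL values. The query plan for a SELECT DISTINCT over the indexed columns still contains an aggregate (distinct) operator after the index scan, even though the index already guarantees uniqueness for those rows. Is there a reason for this?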

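USE tempdb;
GO

CREATE TABLE T1 (Id INT NOT NULL IDENTITY PRIMARY KEY, F1 INT, F2 INT);
GO
-- Unique index restricted to rows where both columns are non-NULL
CREATE UNIQUE NONCLUSTERED INDEX UK_T1 ON T1 (F1, F2) WHERE F1 IS NOT NULL AND F2 IS NOT NULL;
GO
INSERT INTO T1 (F1, F2) VALUES (1, 1), (1, 2), (2, 1);

-- The query in question: its predicate matches the index filter exactly
SELECT DISTINCT F1, F2 FROM T1 WHERE F1 IS NOT NULL AND F2 IS NOT NULL;
-- The same query without DISTINCT, for comparison
SELECT F1, F2 FROM T1 WHERE F1 IS NOT NULL AND F2 IS NOT NULL;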

Query plan: [image]

Best Answer

This is a known SQL Server query optimizer limitation. It has been reported to Microsoft, but the Connect item (no longer available) was closed as Won't Fix.

There are additional consequences of this limitation, including some that I wrote about in Optimizer Limitations with Filtered Indexes. The summary is quoted below:

This post highlights two important optimizer limitations with filtered indexes:

Redundant join predicates can be necessary to match filtered indexes

Filtered unique indexes do not provide uniqueness information to the optimizer

In some cases it may be practical to simply add the redundant predicates to every query. The alternative is to encapsulate the desired implied predicates in an unindexed view. The hash match plan in this post was much better than the default plan, even though the optimizer ought to be able to find the slightly better merge join plan. Sometimes, you may need to index the view and use NOEXPAND hints (required anyway for Standard Edition instances). In still other circumstances, none of these approaches will be suitable.
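As a rough illustration of the indexed-view option mentioned in the quote, applied to the table from the question: the view name dbo.V_T1_NotNull and the index name CUQ_V_T1_NotNull are made up for this sketch, and it assumes T1 lives in the dbo schema and that the session uses the SET options indexed views require (the SSMS defaults). Whether the distinct aggregate actually disappears should be checked in the execution plan on your own version.

```sql
USE tempdb;
GO
-- Hypothetical view that encapsulates the same predicate as the filtered index UK_T1.
-- SCHEMABINDING and two-part object names are required before a view can be indexed.
CREATE VIEW dbo.V_T1_NotNull
WITH SCHEMABINDING
AS
SELECT F1, F2
FROM dbo.T1
WHERE F1 IS NOT NULL AND F2 IS NOT NULL;
GO
-- Materialize the view with a unique clustered index on the same key columns.
CREATE UNIQUE CLUSTERED INDEX CUQ_V_T1_NotNull
ON dbo.V_T1_NotNull (F1, F2);
GO
-- NOEXPAND makes the optimizer treat the indexed view like a table with a unique key,
-- instead of expanding it back to the base table definition.
SELECT DISTINCT F1, F2
FROM dbo.V_T1_NotNull WITH (NOEXPAND);
```

For the specific query in the question, the simpler route is to leave out DISTINCT, since the unique index already guarantees that the filtered rows contain no duplicates.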

Related Solutions

Sql-server – Why is the filtered index being ignored

Since SQL Server can skip NULL rows to start the range scan, the cost of either index is identical, so this is basically a coin toss for the optimizer. Compare the plans in SQL Sentry Plan Explorer* by default and when you hint the index.

Since it's a toss-up, I don't know what benefit you'd get out of forcing SQL Server to choose one of two equally valid options.

* Disclaimer: I work for SQL Sentry.

Sql-server – SQL Server unique index, per subset of data, based on values of another column

Create a unique constraint over two columns.

alter table dbo.Emails add constraint UQ_Emails unique (CustomerId, EmailValue);

This will prevent duplicate emails per customer. Different customers can still have the same email.

From comments:

What if I have an IsDefault field too? In that case, the requirement would be:

There should be only one active email per customer, and each customer should not have duplicate emails.

You can do that if you add a filtered unique index:

create unique nonclustered index UX_Emails 
on dbo.Emails (CustomerId) 
where IsDefault = 1;
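To see how the constraint and the filtered unique index work together, here is a small test script. The original post does not show the table definition, so the dbo.Emails layout below (EmailId, the column types, and the default on IsDefault) is an assumption made for illustration only. DROP TABLE IF EXISTS needs SQL Server 2016 or later.

```sql
USE tempdb;
GO
DROP TABLE IF EXISTS dbo.Emails;
GO
-- Assumed table layout; only CustomerId, EmailValue and IsDefault come from the post.
CREATE TABLE dbo.Emails
(
    EmailId    INT IDENTITY PRIMARY KEY,
    CustomerId INT NOT NULL,
    EmailValue VARCHAR(254) NOT NULL,
    IsDefault  BIT NOT NULL DEFAULT 0
);
GO
-- No duplicate emails per customer.
ALTER TABLE dbo.Emails ADD CONSTRAINT UQ_Emails UNIQUE (CustomerId, EmailValue);
-- At most one default email per customer.
CREATE UNIQUE NONCLUSTERED INDEX UX_Emails
ON dbo.Emails (CustomerId)
WHERE IsDefault = 1;
GO
-- Succeeds: two customers may share an email, and each has a single default.
INSERT INTO dbo.Emails (CustomerId, EmailValue, IsDefault)
VALUES (1, 'a@example.com', 1),
       (1, 'b@example.com', 0),
       (2, 'a@example.com', 1);
GO
-- Fails: customer 1 already has this email (violates UQ_Emails).
INSERT INTO dbo.Emails (CustomerId, EmailValue, IsDefault)
VALUES (1, 'a@example.com', 0);
GO
-- Fails: customer 1 already has a default email (violates UX_Emails).
INSERT INTO dbo.Emails (CustomerId, EmailValue, IsDefault)
VALUES (1, 'c@example.com', 1);
GO
```

Each statement is in its own batch so that the two expected failures do not stop the rest of the script from running.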
