I have a SQL Server database and stumbled across a table with a uniqueidentifier primary key named ActivityId and columns StateCode (int) and ActivityTypeCode (int) along with many others. The tables has 13M rows and there are 94 unique values for ActivityTypeCode with row counts ranging from 1 to 4M. OK, here’s the weird part: I have 97 different filtered indexes on (StateCode, ActivityId) where the only difference is the where clause on (ActivityTypeCode=(_some_value_)).
To my eyes, all 97 indexes could be dropped and replaced with one index on just StateCode (since ActivityID is the primary key and is already included as a hidden column). My question is: Is there any reason why I would want to keep multiple filtered indexes? Could they reduce deadlocks somehow? I’ve done some testing and I don’t see any reason to have so many indexes. Thoughts?
Ken
[Edit]
Based on the excellent answer below, I've done some testing and it does not appear that the filtered indexes prevent blocking on even simple queries. My indexes are of the form:
CREATE INDEX IX_StateCode_ActivityID_ActivityTypeCode_Filtered_0 ON dbo.Activities(StateCode,ActivityId) WHERE ActivityTypeCode = 0;
If I go into one database connection and issue the following:
Begin transaction;
Update Activities set StateCode=888 where ActivityTypeCode=1
Then go into a second connection and issue this:
Update Activities set StateCode=999 where ActivityTypeCode=3
The second connection hangs. It seems these different indexes do not prevent blocking so I don't see any reason to keep them. Given that there are indexes for ActivityTypeCode values that do not exist, I suspect they were all generated by some sort of development tool. FYI: This is from a third party vendor.
Thanks everyone for your help.
[Edit #2] Upon detail examiniation, not all indexes are the same. Below are the combinations. Keep in mind ActivityID is the primary key.
StateCode, ActivityID
StateCode, CreatedOn, ActivityID
StateCode, CreatedOn, StatusCode, ActivityID
StateCode, StatusCode, CreatedOn, Activity ID
I don't think any of this will prevent me from replacing them all with a single index – it will just be a bit larger than originally anticipated.
Best Answer
Like Brent said in the comments, this will be hard to give a good anser without many more details.
And as Jonathan Fite said, if the queries are using the filtered indexes, they could be beneficial.
Testing
This answer will simply be some tests and considerations in the hope of making your decision easier. There could be reasons for having all these filtered indexes depending on your workload, or they could be hoarding space and resources without having benefits. My amount of data will be smaller than yours, testing on
SQL Server 2017
. As always YMMV.Trying to recreate parts of your setup
I went for 94 indexes, for the 94 distinct values in
ActivityTypeCode
The space used
Filtered index testing
In your question you did not specify if
ActivityTypeCode
is included in the filtered indexes. If they are not included, only exact filter matches will be used. More on that here.Parameterized queries without OPTION RECOMPILE
When using parameterized queries without
OPTION(RECOMPILE)
, the parameter will not be known to sql server at runtime.This means that you will not be able to use your filtered indexes. We would not be able to reuse the plan with different parameters if we were able to use filtered indexes.
This is also true for stored procedures like the following:
Parameterized queries with OPTION RECOMPILE
As stated earlier,
OPTION(RECOMPILE)
enables us to use filtered indexes when creating the plan.You could add the same
OPTION(RECOMPILE)
to the procedure talked about earlier.You can also resort to dynamic SQL to pass parts of the filtering as a literal and part of the filtering as a parameterized query. More on that here.
Estimates
The one issue with the plan that uses the filtered index is that the query estimates are way higher than the actuals:
You could resolve these estimates by creating / changing the filtered index so that it has ActivityTypeCode as one of the key columns instead of ActivityId.
This is only true when also filtering on
StateCode
.I was not able to create custom statistics or get sql server to use custom statistics to fix this. There might be a way to do this without having to change the index
Filtered index < - > regular index(es)
Assumptions
Lets assume that the queries can leverage the filtered indexes and that the
ActivityTypeCode
is included in the index.The indexes
Proposed index by OP
This index seems strange to me, if you are only filtering on
StateCode
, why would the filtered indexes onActivityTypeCode
even exist?Which is why we will focus on two other indexes in this example:
The space used
Testing
Filtering on
StateCode
&ActivityTypeCode
Running the query that used the filtered index earlier:
The new index is used and the estimates are correct.
We saw earlier that the estimated for the filtered index were ~4M rows for these specific parameters. This could have been circumvented by creating a new index
Filtering only on
ActivityTypeCode
When only filtering on
ActivityTypeCode
it starts to get interestingThe non filtered index is used
While in theory, the entire filtered index could be scanned without having to apply any filtering:
And both estimated rows are correct.
Even when filtering on the parameter that should return +4M rows
@P1=2
the non filtered index + seek is chosen.This is due to the estimated rows and the index seek <-> scan tipping point.
When adding 4M extra rows, the scan is chosen.
If it comes down to this, I would be fine with dropping the filtered indexes and using the regular indexes specified above. But again, this is based on one query on own test data. More details would be needed to get an answer closer aligned with your dataset & workload.
What does the testing tell us?
The testing results only gives us information about my own tests and my own queries.
Some takeaways are that:
OPTION(RECOMPILE)
you will be able to use filtered indexes and might get better query plans at the cost of recompilationsThe rest is up to you to figure out.