SQL Server 2008 – Question about index behaviour

Tags: filtered-index, index, index-tuning, sql-server-2008

I have a general question about advanced issues regarding index behaviour.

In short, roughly a year ago we dynamically dropped and recreated a number of indexes with different filters, reusing the same names as before. Our tests seemed to run fine, but we later found that our testing environments were still producing plans based on the old index definitions, while our production environment used the new ones. The tests had therefore produced misleading results, so we changed these indexes back to the old versions in production, where the old filter definitions were immediately picked up by the plans.

This worked fine for the past six months. Now, however, we have the opposite problem: our production environment has suddenly fallen back to using plans based on those six-month-old, incorrectly filtered index definitions, whereas until a few weeks ago it was still using the plans the existing indexes are supposed to produce.

We've tested fixing this by again dropping these problem indexes, and this time creating them with a different name entirely. This seems to be working fine.

But my question is this: given that the indexes have been dropped (not just renamed) and then recreated with the same name, and that the plan cache has been cleared AND the statistics updated several times, how is it possible that SQL Server seems to have a mind of its own and has now reverted to ancient plans that I didn't even know could still exist?

Basically, how exactly does SQL Server store and use the data associated with indexes and their plans? How can you force SQL Server to completely clear that cache, wherever it may be, so that it absolutely cannot decide to reuse ancient, detrimental plans simply because an index name is the same? How does all of this work, so that we can understand it and never have to deal with this issue again?
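For context, the standard ways to force SQL Server to discard cached plans and refresh statistics look roughly like this (a sketch; `dbo.MyTable` is a placeholder name, not from my actual schema):

```sql
-- Flush the entire plan cache for the instance (use with care in production)
DBCC FREEPROCCACHE;

-- Mark every cached plan that references a given table for recompilation
EXEC sp_recompile N'dbo.MyTable';

-- Refresh that table's statistics with a full scan
UPDATE STATISTICS dbo.MyTable WITH FULLSCAN;
```

These are the steps I believe I have already tried, which is why the behaviour is so confusing.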

Thanks!

EDIT:

It's now all but confirmed that these six-month-old filtered index definitions were the cause. I restored the DB to a testing environment and ran the problem queries against it, and they produced the wrong execution plans compared to another, older testing environment. I checked each and every index used by the older, correctly working environment, and every single index definition was identical. I updated the stats, reorganized and rebuilt the indexes so that none of them had fragmentation above 35%, cleared the plan cache, and still the problem persisted.
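To compare definitions I read the index metadata directly, along the lines of the following (a sketch; the table name is a placeholder — `sys.indexes` has exposed `has_filter` and `filter_definition` since SQL Server 2008):

```sql
-- Show each index's filter state and when its statistics were last updated
SELECT i.name                               AS index_name,
       i.has_filter,
       i.filter_definition,
       STATS_DATE(i.object_id, i.index_id)  AS stats_last_updated
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.MyTable');  -- placeholder table name
```

Even with this metadata showing unfiltered definitions, the planner behaved as described below.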

I then tracked down specifically those indexes involved in the query that had briefly been filtered six months ago, dropped them, and recreated them. (The first time, they still didn't behave correctly; after a second attempt following another restore, they DID start working correctly.) Next, I dropped each index and recreated it under a name that had never been used before, otherwise keeping the same definition. This fixed the problem every time, and the execution planner would then use the correct indexes. Finally, I dropped these again and recreated the indexes using the filter definitions from six months back, again under brand-new names to ensure the query planner would use the new definitions rather than ghost statistics from older ones. The bad plan produced with these filtered indexes in place was identical to the one initially seen after all the definitions, fragmentation, statistics etc. had been checked and refreshed. That proves, once and for all, that despite all the seemingly correct metadata, the execution planner had been working under the assumption that the indexes were filtered, and therefore unusable, all along.

Does anyone know what could be going on? Or is this something that should be reported as a bug, however rare it may be? The implications of this behaviour, and the effects it has already had, are severe enough that I'm currently considering logging every index name just so that none of them is ever reused. Otherwise this paints a grim picture: SQL Server may keep ancient statistics around in the background and start using them at any time, completely nullifying the structure and purpose of new indexes that may be business-critical. While it seems extremely likely that the failover had something to do with this, I still can't understand how perfectly working indexes could suddenly be replaced by outdated and completely wrong definitions and statistics, to the point where no amount of rebuilding them or updating the applicable metadata would help. And there is no real way to even diagnose that this has happened, other than a sudden drop in performance and quirky behaviour on the query planner's part.

Best Answer

SQL Server does not save previous index definitions for reuse at a later time, so query plans are based on the most recent statistics that were available at compile time.

Although stored procedures do not recompile constantly, they will recompile in time. You can also recompile procedures individually: the sp_recompile system procedure marks a stored procedure so that it recompiles the next time it is executed.
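For example (the object names are placeholders):

```sql
-- Mark a single procedure for recompilation on its next execution
EXEC sp_recompile N'dbo.usp_MyProcedure';

-- Passing a table name instead marks every procedure and trigger
-- that references that table for recompilation
EXEC sp_recompile N'dbo.MyTable';
```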

I see that you mention recreating statistics and recreating indexes, both of which help to produce a correct set of statistics. You do not mention index reorganization, which can also be part of maintaining the health of the indexes.
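The two maintenance options look roughly like this (index and table names are placeholders):

```sql
-- Reorganize: an online, lightweight defragmentation of the leaf level
ALTER INDEX IX_MyIndex ON dbo.MyTable REORGANIZE;

-- Rebuild: drops and recreates the index structure, which also refreshes
-- the index's statistics with the equivalent of a full scan
ALTER INDEX IX_MyIndex ON dbo.MyTable REBUILD;
```

Note that REORGANIZE, unlike REBUILD, does not update statistics on its own, so it is usually paired with an UPDATE STATISTICS step.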

However, if the definition of the index has changed back to an earlier version, then someone or some process must have changed it.