Sql-server – Does a covering nonclustered index help to answer Select * queries

index-tuningsql serversql-server-2008-r2

Our application developers run a lot of SELECT * queries against our database system, unfortunately.

The missing index suggestions of sys.dm_db_missing_index contain a lot of cases that indicate to create a 100% covering non-clustered index for those cases. For example actually there is an index with 3 index columns and additional 25 included columns suggested. Other indexes with the key columns suggested are already in place but then require lookups.

I am a little surprised about this. However since the underlying queries are running very often with high costs and the suggested index comes with an impact of about 50% this is maybe an interesting option.

I know it is not best practice and bad design to support those queries, but I doubt this could be changed within adequate time, so I wonder if those indexes will help at all or produce more costs than benefit?

Is there a rule or do I just have to try / error the consequences?

  • Row counts are between 60,000 and 2.1 million.
  • The missing indexes DMVs tell me on average (over multiple servers) 2424 seek operations could be performed for the queries (this is a number counting up continuously for about 3 months and the query cost of 19 in average could be impacted positively of about 50%.
  • Removing SELECT * would be best, but I am asked to optimize it as is, and since SQL Server itself suggests this, I think it is worth thinking about it anyway.
  • The existing queries are not really slow, < 1 sec, but this is quite subjective as well I think. Execution plan tells the key lookups cost 80% of the whole, so I guess saving this might be desirable anyways.
  • I did some tests without / with the 100% covering index. Results:

    SELECT * : Costs before: 0,026 after: 0,00329 (new index used) = -88% improvement
    UPDATE: before: 0,043333 after: 0,0533 (ix used) = +20%
    INSERT: before: 0,1133 after: 0,1233 (ix not used)=equal
    DELETE: before: 0,645 after: 0,655 (ix used) =equal.

    Conclusion: It tunes the selects much more than it slows down the changes. Why not apply?

Best Answer

The missing index suggestions are opportunistic entries added whenever the optimizer happens to notice that an exact-match index for the current set of predicates it is considering do not exist on the base object.

The information recorded in the DMVs is intended to be a helpful input to the normal activities of a skilled database tuner; it is not intended to be interpreted as the result of a comprehensive analysis. The data recorded is intentionally a complete set of the keys and other columns needed, together with a brief indication of the reason it was added (equality, inequality, include).

The "impact" recorded is also very basic: it is the difference in estimated cost, according to the cost model used by the optimizer. As usual, this needs to be treated with a healthy dose of skepticism, because it is very unlikely that the actual performance characteristics of your hardware match the optimizer's model, even vaguely.

Note also that cost estimations depend heavily on cardinality (row count), average row size, and distribution information. If these are not reasonably close to the actual values, the "cost" is an even less reliable metric. Nevertheless, right now, estimated cost is all we have, so that's the number reported. Again, SQL Server is just trying to be as helpful as it can; it is up to the person reviewing the information to interpret it correctly.

Tests based on estimated costs are potentially informative to an early assessment of target areas for improvement, but they are no substitute for actual measurements of whatever metrics are important in your environment. The relative importance of duration, CPU usage, physical reads, memory usage ... and so on ... will be different in different circumstances.

The point really is to measure where the real problems are, record a baseline, make improvements, then measure and assess the benefits against the costs.

For a more thorough analysis than the missing index feature provides, you would need to run a representative workload through the Database Engine Tuning Advisor. Even there, the recommendations will still need to be assessed by a human, and something similar to the above baseline-assess-change-measure protocol followed.

That said, not all final decisions will be based on performance comparisons. There may be other considerations, like the avoidance of a particularly inconvenient lookup-related deadlock. Index tuning is as much art as science sometimes.