Sql-server – Find statements related to sys.dm_db_missing_index… tables (SQL Server)

dmvindexsql-server-2008

I have read up different articles related to SQL Server 2008 on how to find out missing indexes using the tables sys.dm_db_missing_index_groups, sys.dm_db_missing_index_group_stats etc.

One thing I'd like to know: Is it possible to extract the queries that would benefit from these indexes?

Thanks all!

Best Answer

One approach that I like to use, is to focus just on the most frequently executed queries and not all queries that might benefit from having indexes. The reason for this is the impact and cost (CPU time, I/O) that additional indexes would have on DML (DATA MODIFICATION LANGUAGE) statements.

Imagine a table customers that has 4 columns (id_customer, first_name, last_name, address) that has no nonclustered indexes.

The query UPDATE customers SET first_name = 'fred' where id_customer = '1' only updates the table customers. If you add a nonclustered index on first_name and last_name called idx_cust_first_last, that same update statement above now has to modify not only the table customers but also the index idx_cust_first_last for the affected row or rows. This results in an additional IO and CPU cost that isn't there without the index.

In short, just because an index shows up in the missing index views, doesn't mean you should add it until you are certain that the benefit gained from that index outweighs the evenutal cost to the modifications made to your data.

Here is a link to an article written by Kimberly Tripp on the subject which you may find useful.

That said, here is a query I use to display queries that are missing indexes in desending order based on total cumulated execution time. So queries that are taking a long time to finish show up on top of the list. In my experience, the execution_count column is extremely important. This is because the execution count shows just how "popular" that query truly is. If a query takes a very long time to run, shown by the total_elapsed_time column but was only executed five times then adding an index may not be useful because that query is only run occassionaly.

Here is the script. Its not mine.

WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS sp)
SELECT p.query_plan.value(N'(sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Database)[1]', 'NVARCHAR(256)') AS DatabaseName
,s.sql_handle
,s.total_elapsed_time
,s.last_execution_time
,s.execution_count
,s.total_logical_writes
,s.total_logical_reads
,s.min_elapsed_time
,s.max_elapsed_time
,p.query_plan
,p.query_plan.value(N'(sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Table)[1]', 'NVARCHAR(256)') AS TableName
,p.query_plan.value(N'(/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Schema)[1]', 'NVARCHAR(256)') AS SchemaName
,p.query_plan.value(N'(/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/@Impact)[1]', 'DECIMAL(6,4)') AS ProjectedImpact
,ColumnGroup.value('./@Usage', 'NVARCHAR(256)') AS ColumnGroupUsage
,ColumnGroupColumn.value('./@Name', 'NVARCHAR(256)') AS ColumnName
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) AS p
CROSS APPLY p.query_plan.nodes('/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/sp:ColumnGroup') AS t1 (ColumnGroup)
CROSS APPLY t1.ColumnGroup.nodes('./sp:Column') AS t2 (ColumnGroupColumn)
WHERE p.query_plan.exist(N'/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan//sp:MissingIndexes') = 1
ORDER BY s.total_elapsed_time DESC

And here is an example of the output

enter image description here

This queries displayed here were either currently being executed or already in cache. If the cache has recently been cleared by a server restart or by a dbcc command, then there may not be much information to display.

Microsoft's documentation on sys.dm_exec_query_plan and sys.dm_exec_query_stats which are being used to return information about cached query plans