One approach that I like to use, is to focus just on the most frequently executed queries and not all queries that might benefit from having indexes. The reason for this is the impact and cost (CPU time, I/O) that additional indexes would have on DML (DATA MODIFICATION LANGUAGE) statements.
Imagine a table customers that has 4 columns (id_customer, first_name, last_name, address) that has no nonclustered indexes.
The query UPDATE customers SET first_name = 'fred' where id_customer = '1'
only updates the table customers. If you add a nonclustered index on first_name and last_name called idx_cust_first_last, that same update statement above now has to modify not only the table customers but also the index idx_cust_first_last for the affected row or rows. This results in an additional IO and CPU cost that isn't there without the index.
In short, just because an index shows up in the missing index views, doesn't mean you should add it until you are certain that the benefit gained from that index outweighs the evenutal cost to the modifications made to your data.
Here is a link to an article written by Kimberly Tripp on the subject which you may find useful.
That said, here is a query I use to display queries that are missing indexes in desending order based on total cumulated execution time. So queries that are taking a long time to finish show up on top of the list. In my experience, the execution_count column is extremely important. This is because the execution count shows just how "popular" that query truly is. If a query takes a very long time to run, shown by the total_elapsed_time column but was only executed five times then adding an index may not be useful because that query is only run occassionaly.
Here is the script. Its not mine.
WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS sp)
SELECT p.query_plan.value(N'(sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Database)[1]', 'NVARCHAR(256)') AS DatabaseName
,s.sql_handle
,s.total_elapsed_time
,s.last_execution_time
,s.execution_count
,s.total_logical_writes
,s.total_logical_reads
,s.min_elapsed_time
,s.max_elapsed_time
,p.query_plan
,p.query_plan.value(N'(sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Table)[1]', 'NVARCHAR(256)') AS TableName
,p.query_plan.value(N'(/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Schema)[1]', 'NVARCHAR(256)') AS SchemaName
,p.query_plan.value(N'(/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/@Impact)[1]', 'DECIMAL(6,4)') AS ProjectedImpact
,ColumnGroup.value('./@Usage', 'NVARCHAR(256)') AS ColumnGroupUsage
,ColumnGroupColumn.value('./@Name', 'NVARCHAR(256)') AS ColumnName
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) AS p
CROSS APPLY p.query_plan.nodes('/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/sp:ColumnGroup') AS t1 (ColumnGroup)
CROSS APPLY t1.ColumnGroup.nodes('./sp:Column') AS t2 (ColumnGroupColumn)
WHERE p.query_plan.exist(N'/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan//sp:MissingIndexes') = 1
ORDER BY s.total_elapsed_time DESC
And here is an example of the output
This queries displayed here were either currently being executed or already in cache. If the cache has recently been cleared by a server restart or by a dbcc command, then there may not be much information to display.
Microsoft's documentation on sys.dm_exec_query_plan and sys.dm_exec_query_stats which are being used to return information about cached query plans
To answer the question: How to find the offending query:
Since the spikes in the graph you posted last for several minutes you have plenty of time to use the following method:
Download sysinternals process explorer
- start process explorer and find the SQL Server process.
- right click and select properties
- look at the thread tab.
- Sort on the CPU column and note the thread id (TID) that is consuming the most CPU.
Use this query and lookup the query that is currently being executed by that thread:
SELECT r.session_id, st.text, qp.query_plan
FROM sys.dm_os_threads AS ot
JOIN sys.dm_os_tasks AS t
ON t.worker_address = ot.worker_address
JOIN sys.dm_exec_requests AS r
ON t.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) AS qp
WHERE os_thread_id = <thread id>
Best Answer
The more appropriate way of finding fragmentation details for SQL Server is by using DMV sys.dm_db_index_physical_stats. Please note since DMV's came from SQL Server 2005 onward below script will work for 2005 onward only
For statistics you also have sys.dm_db_stats_properties IF you are using SQL Server 2008 R2 Sp2 and above you can use below script. Source