I believe you will see this symptom if you have a LOT of large query plans fighting for memory in order to compile (this has very little to do with running the queries themselves). To hit this, I suspect you are using an ORM or some kind of application that generates many unique but relatively complex queries. SQL Server could be under memory pressure because of things like large query operations, but on further thought it is more likely that your system is simply configured with far less memory than it needs: either there is never enough memory to satisfy all of the queries you're trying to compile, or other processes on the box are stealing memory from SQL Server.
You can take a look at what SQL Server is configured with using:
EXEC sp_configure 'max server memory'; -- max configured in MB
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN
(
'Total Server Memory (KB)', -- max currently granted
'Target Server Memory (KB)' -- how much SQL Server wished it had
);
You can identify the cached plans that required the most compile memory with the following query from Jonathan Kehayias, adapted slightly:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (10) CompileTime_ms, CompileCPU_ms, CompileMemory_KB,
qs.execution_count,
qs.total_elapsed_time/1000.0 AS duration_ms,
qs.total_worker_time/1000.0 as cputime_ms,
(qs.total_elapsed_time/qs.execution_count)/1000.0 AS avg_duration_ms,
(qs.total_worker_time/qs.execution_count)/1000.0 AS avg_cputime_ms,
qs.max_elapsed_time/1000.0 AS max_duration_ms,
qs.max_worker_time/1000.0 AS max_cputime_ms,
SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
(CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text) ELSE qs.statement_end_offset
END - qs.statement_start_offset) / 2 + 1) AS StmtText,
query_hash, query_plan_hash
FROM
(
SELECT
c.value('xs:hexBinary(substring((@QueryHash)[1],3))', 'varbinary(max)') AS QueryHash,
c.value('xs:hexBinary(substring((@QueryPlanHash)[1],3))', 'varbinary(max)') AS QueryPlanHash,
c.value('(QueryPlan/@CompileTime)[1]', 'int') AS CompileTime_ms,
c.value('(QueryPlan/@CompileCPU)[1]', 'int') AS CompileCPU_ms,
c.value('(QueryPlan/@CompileMemory)[1]', 'int') AS CompileMemory_KB,
qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY qp.query_plan.nodes('ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS n(c)
) AS tab
JOIN sys.dm_exec_query_stats AS qs ON tab.QueryHash = qs.query_hash
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY CompileMemory_KB DESC
OPTION (RECOMPILE, MAXDOP 1);
You can see how the plan cache is being used with the following:
SELECT objtype, cacheobjtype,
    AVG(size_in_bytes*1.0)/1024.0/1024.0 AS avg_mb,
    MAX(size_in_bytes)/1024.0/1024.0 AS max_mb,
    SUM(size_in_bytes)/1024.0/1024.0 AS total_mb,
    COUNT(*) AS plan_count
FROM sys.dm_exec_cached_plans
GROUP BY GROUPING SETS ((),(objtype, cacheobjtype))
ORDER BY objtype, cacheobjtype;
When you are experiencing high semaphore waits, check whether these query results vary significantly from what you see during "normal" activity:
SELECT resource_semaphore_id, -- 0 = regular, 1 = "small query"
pool_id,
available_memory_kb,
total_memory_kb,
target_memory_kb
FROM sys.dm_exec_query_resource_semaphores;
SELECT StmtText = SUBSTRING(st.[text], (qs.statement_start_offset / 2) + 1,
(CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text) ELSE qs.statement_end_offset
END - qs.statement_start_offset) / 2 + 1),
r.start_time, r.[status], DB_NAME(r.database_id), r.wait_type,
r.last_wait_type, r.total_elapsed_time, r.granted_query_memory,
m.requested_memory_kb, m.granted_memory_kb, m.required_memory_kb,
m.used_memory_kb
FROM sys.dm_exec_requests AS r
INNER JOIN sys.dm_exec_query_stats AS qs
ON r.plan_handle = qs.plan_handle
INNER JOIN sys.dm_exec_query_memory_grants AS m
ON r.request_id = m.request_id
AND r.plan_handle = m.plan_handle
CROSS APPLY sys.dm_exec_sql_text(r.plan_handle) AS st;
And you may also want to look and see how memory is distributed:
DBCC MEMORYSTATUS;
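DBCC MEMORYSTATUS produces a lot of output; as a lighter-weight sketch of the same idea, you can group the memory clerks DMV by clerk type (the pages_kb column assumes SQL Server 2012 or later; older versions expose single_pages_kb / multi_pages_kb instead):

```sql
-- Memory distribution by clerk type (pages_kb exists in SQL Server 2012+)
SELECT [type], SUM(pages_kb) / 1024.0 AS memory_mb
FROM sys.dm_os_memory_clerks
GROUP BY [type]
ORDER BY memory_mb DESC;
```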
And there is some good information here about why you might be seeing a high number of compiles/recompiles (which will contribute to that wait):
http://technet.microsoft.com/en-us/library/ee343986(v=sql.100).aspx
http://technet.microsoft.com/en-us/library/cc293620.aspx
You can check for high compile/recompile counts using the following counters:
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN
(
'SQL Compilations/sec',
'SQL Re-Compilations/sec'
);
And you can check for internal memory pressure leading to plan cache evictions; non-zero values in the rounds_count and removed_all_rounds_count columns here would indicate that the clock hands have been actively sweeping plans out of the cache:
SELECT * FROM sys.dm_os_memory_cache_clock_hands
WHERE [type] IN (N'CACHESTORE_SQLCP', N'CACHESTORE_OBJCP');
NOTE: Most of these metrics don't have a magic "oh my gosh, I need to panic or do something!" threshold. What you need to do is take measurements during normal system activity and determine where those thresholds lie for your hardware, configuration, and workload. The time to do something is when two conditions are true:
- the metrics vary significantly from normal values; and,
- there is actually a performance problem occurring (like your CPU spikes), and it is actually interfering with something. Other than seeing the CPUs spike, are you seeing any other symptom? In other words, is the spike the symptom, or is the spike causing other symptoms? Would users of the system ever notice? A lot of people always go after their highest wait consumer simply because it's the highest. Something is always going to be the highest wait consumer; you have to know that it varies enough from normal activity to indicate a problem or some significant change.
Optimize for ad hoc workloads
is a great setting for 99% of the workloads out there, but it will not help much in reducing compilation costs: it is aimed at reducing plan cache bloat by storing only a small stub for a single-use plan until the query has been executed a second time. Even when only the stub is stored in the plan cache, you still have to compile the full plan in order to execute the query. Perhaps what @Kahn meant to recommend was setting database-level parameterization to forced, which can provide better plan re-use (though it really depends on how unique all of these high-cost queries are).
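For reference, here is a sketch of how those two options are enabled (the database name is a placeholder; test forced parameterization carefully, since it changes plan behavior for every query in the database):

```sql
-- Instance-wide: cache only a plan stub until a query is executed a second time
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;

-- Database-level forced parameterization (YourDatabase is a placeholder)
ALTER DATABASE YourDatabase SET PARAMETERIZATION FORCED;
```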
There is also some good information in this white paper about plan caching and compilation.
Best Answer
As mentioned, the count being only 9 (i.e., the wait showed up only 9 times during a full day of gathering wait stats) would not be a concern for me.
However, it's worth checking the related wait types BACKUPIO and BACKUPBUFFER. These wait types generally show up when you are taking backups over a network with poor connectivity, to tape, or to any other extremely slow backup destination.
We saw a similar wait on one of our SQL Server 2005 instances when backing up a database approximately 200 GB in size. Because the database was large and native backup compression was not available at the time, we witnessed a lot of these waits. We had to go with a third-party compression tool, after which those wait counts dropped. So I believe checking on your backups would be a good start. Make sure you use SQL Server's backup compression feature when backing up your databases.
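As a sketch, native backup compression (introduced in SQL Server 2008) is just an extra option on the backup command; the database name and path below are placeholders:

```sql
-- Compressed full backup (database name and path are placeholders)
BACKUP DATABASE YourDatabase
TO DISK = N'X:\Backups\YourDatabase.bak'
WITH COMPRESSION, STATS = 10;
```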
However, just to confirm whether this wait type is actually causing problems, you need to check your backup throughput for issues.
You could also test native backups on the same network versus another network, and compare the relevant counter values.
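One way to estimate backup throughput, as a sketch, is from the backup history in msdb:

```sql
-- Approximate backup throughput (MB/sec) from backup history in msdb
SELECT TOP (20)
    database_name,
    backup_start_date,
    backup_size / 1024.0 / 1024.0 AS backup_size_mb,
    DATEDIFF(SECOND, backup_start_date, backup_finish_date) AS duration_sec,
    (backup_size / 1024.0 / 1024.0)
      / NULLIF(DATEDIFF(SECOND, backup_start_date, backup_finish_date), 0) AS mb_per_sec
FROM msdb.dbo.backupset
ORDER BY backup_start_date DESC;
```

Comparing these numbers for backups taken on different networks or to different destinations should show whether the backup target is the bottleneck.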
Also, you can refer to A cause of high-duration ASYNC_IO_COMPLETION waits for more explanation from Paul on this wait type.