Note that all servers use the same NICs - Broadcom 5709Cs with
up-to-date drivers. The servers themselves are Dell R610s.
Kyle Brandt has an opinion on Broadcom network cards which echoes my own (repeated) experience.
Broadcom, Die Mutha
My problems have always been associated with TCP Offload features, and in 99% of cases disabling those features or switching to another network card has resolved the symptoms. One client that (as in your case) uses Dell servers always orders separate Intel NICs and disables the on-board Broadcom cards at build time.
As described in this MSDN blog post, I would start with disabling in the OS with:
netsh int ip set chimney DISABLED
IIRC, it may be necessary to disable the features at the card driver level in some circumstances; it certainly won't hurt to do so.
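As a sketch, the equivalent global switches on Windows Server 2008 and later look like the following (the older `netsh int ip set chimney` form above is the Server 2003-era syntax; exact option availability varies by OS version and driver):

```
rem Disable TCP Chimney Offload and related offload features globally.
netsh int tcp set global chimney=disabled
netsh int tcp set global rss=disabled
netsh int ip set global taskoffload=disabled

rem Verify the resulting settings:
netsh int tcp show global
```

Driver-level offload settings (Large Send Offload, checksum offload, etc.) live in the NIC's Advanced properties in Device Manager, so they have to be toggled per adapter there.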
This is an awesome post.
To answer your final question, I'd speculate that your answer is "yes".
That said, I probably would have pursued soft-NUMA before resorting to the trace flags. I think you are right about the NUMA node allocation, and that could be at the root of your problem. Via soft-NUMA you could scale out the requests across your NUMA nodes (four, if that's the correct count), and then assign each host, via IP address, to a specific NUMA node. In addition, I'd disable hyper-threading. Combined, these changes would likely reduce the issue, though at the cost of fewer schedulers.
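For SQL Server 2008/2008 R2, soft-NUMA is configured in the registry rather than in T-SQL; a rough sketch for splitting 16 CPUs into 4 soft-NUMA nodes might look like the following (the key path, masks, and port numbers are illustrative assumptions, not taken from your environment):

```
HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node0
    CPUMask (DWORD) = 0x0000000F    ; CPUs 0-3
...\Node1  CPUMask = 0x000000F0    ; CPUs 4-7
...\Node2  CPUMask = 0x00000F00    ; CPUs 8-11
...\Node3  CPUMask = 0x0000F000    ; CPUs 12-15
```

Connections are then affinitized per node via the TCP/IP properties in SQL Server Configuration Manager (the `TcpPort` value accepts a node bitmap per port, e.g. `1433[0x1],1434[0x2],1435[0x4],1436[0x8]`), and each application host is pointed at its assigned port or IP. A restart of the instance is required for the node configuration to take effect.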
On a separate note, I'd look at forced parameterization; the fact that your workload is driving your CPU so high is very interesting, and it may be worth investigating.
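Forced parameterization is a database-level setting; a minimal sketch, with the database name as a placeholder:

```sql
-- "YourDatabase" is a placeholder. Forced parameterization makes SQL Server
-- parameterize literal values in ad hoc statements, which cuts compile
-- activity when the workload is many near-identical queries.
ALTER DATABASE YourDatabase SET PARAMETERIZATION FORCED;

-- To revert to the default behavior:
-- ALTER DATABASE YourDatabase SET PARAMETERIZATION SIMPLE;
```

It can change plan quality for queries that benefit from literal-specific plans, so test it against your workload before applying it in production.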
Lastly, on multi-NUMA-node systems I typically have the output of the following queries dumped to a table every N seconds. That makes for some interesting analysis when workloads change or trace flags are implemented:
SELECT GETDATE() AS poll_time, node_id, node_state_desc, memory_node_id,
       online_scheduler_count, active_worker_count, avg_load_balance, idle_scheduler_count
FROM sys.dm_os_nodes WITH (NOLOCK)
WHERE node_state_desc <> N'ONLINE DAC'
and
SELECT TOP 10 GETDATE() AS sample_poll, wait_type, COUNT(*)
FROM sys.dm_os_waiting_tasks
WHERE [wait_type] NOT IN
    ('CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK', 'SLEEP_SYSTEMTASK',
     'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'BROKER_TASK_STOP',
     'BROKER_RECEIVE_WAITFOR', 'OLEDB', 'CLR_MANUAL_EVENT', 'CLR_AUTO_EVENT')
GROUP BY wait_type
ORDER BY COUNT(*) DESC
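One way to wire up the "dump to a table every N seconds" part is a SQL Agent job running an INSERT like the sketch below; the table name and column types are assumptions (the types mirror what `sys.dm_os_nodes` exposes), so adjust to taste:

```sql
-- Placeholder monitoring table; create once.
CREATE TABLE dbo.node_poll (
    poll_time              datetime      NOT NULL,
    node_id                smallint      NOT NULL,
    node_state_desc        nvarchar(256) NOT NULL,
    memory_node_id         smallint      NULL,
    online_scheduler_count smallint      NULL,
    active_worker_count    int           NULL,
    avg_load_balance       int           NULL,
    idle_scheduler_count   smallint      NULL
);

-- Run this from an Agent job scheduled every N seconds.
INSERT INTO dbo.node_poll
SELECT GETDATE(), node_id, node_state_desc, memory_node_id,
       online_scheduler_count, active_worker_count,
       avg_load_balance, idle_scheduler_count
FROM sys.dm_os_nodes WITH (NOLOCK)
WHERE node_state_desc <> N'ONLINE DAC';
```

The same pattern works for the wait-stats query; just add a second destination table keyed on `sample_poll` and `wait_type`.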
Best Answer
The sp_WhoIsActive documentation says:
However, I need to point out a few things in your question:
In your case, I'd start by troubleshooting the lead blocker. Don't worry about the queries waiting to compile - they haven't even started yet. In your scenario, the last thing you should be doing is starting even more queries, which are likely to just get blocked anyway and consume more workspace memory while they wait. Get the lead blocker out of the way instead.
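sp_WhoIsActive itself can surface the lead blocker directly; `@find_block_leaders` and `@sort_order` are documented parameters of the procedure:

```sql
-- Walk the blocking chain and sort so that sessions blocking the most
-- other sessions (the lead blockers) float to the top.
EXEC sp_WhoIsActive
    @find_block_leaders = 1,
    @sort_order = '[blocked_session_count] DESC';
```

The session at the top of that output is the one to investigate (or, if necessary, kill) before anything else.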