There is a lot of misunderstanding about CXPACKET. CXPACKET isn't the cause of your problems; it is a side effect. A CXPACKET wait means that one thread of a parallel query is waiting for another thread of that query to do something. Seeing a lot of CXPACKET waits doesn't necessarily mean there's a problem with the query itself; it means the problem is somewhere else. When you see CXPACKET waits, look at the other threads of the same SPID and see what they are waiting on besides CXPACKET. That other wait is the real problem.
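As a rough sketch of that technique (the session id 1234 below is a placeholder; substitute the SPID you are investigating), you can list the sibling threads of a parallel query and their non-CXPACKET waits like this:

```sql
-- Show every worker thread of one parallel query and what it is waiting on.
-- 1234 is a placeholder SPID; substitute the session you are investigating.
SELECT session_id,
       exec_context_id,         -- 0 = coordinator thread, >0 = parallel workers
       wait_type,
       wait_duration_ms,
       resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id = 1234
  AND wait_type <> N'CXPACKET'  -- the interesting waits hide behind CXPACKET
ORDER BY wait_duration_ms DESC;
```

Whatever wait type dominates this output (PAGEIOLATCH, LCK_M_*, etc.) is where to focus your tuning.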
To your specific issue, the reason the run times vary so wildly is probably that SQL Server is generating a different plan on some days because the statistics are out of date (making the job run long). Then you either update statistics manually (or via a job), or auto-update stats kicks in, the plan improves, and the job runs quickly again.
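A quick way to check that theory is to look at how stale the statistics are before the job runs. This is a sketch assuming SQL Server 2008 R2 SP2 / 2012 SP1 or later (where `sys.dm_db_stats_properties` exists); `dbo.FactSales` is a placeholder table name:

```sql
-- How stale are the statistics on one table?
-- dbo.FactSales is a placeholder; substitute your own table.
SELECT s.name                  AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter -- row changes since the last stats update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.FactSales')
ORDER BY sp.modification_counter DESC;

-- If they are stale, refresh them before the job runs:
UPDATE STATISTICS dbo.FactSales WITH FULLSCAN;
```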
Once you've solved the stats problem you can start looking at other reasons the job runs slowly. The fact that you only have a single disk certainly isn't helping.
This is an awesome post.
To answer your final question, I'd speculate that your answer is "yes".
That said, I probably would have pursued soft-NUMA before resorting to the trace flags. I think you are right about the NUMA node allocation, and that could be at the root of your problem. Via soft-NUMA you could scale out the requests across your NUMA nodes (4, if that's the correct count) and then assign each host to a specific node by IP address. In addition, I'd disable hyper-threading. Combined, those changes would likely reduce the issue, though at the cost of fewer schedulers.
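To illustrate the host-to-node assignment: SQL Server lets you affinitize TCP ports to NUMA nodes by appending a node bitmask to the port in the TCP Port field of SQL Server Configuration Manager. The port numbers and masks below are hypothetical, assuming four nodes:

```
1433[0x1],1434[0x2],1435[0x4],1436[0x8]
```

Each mask bit selects a node (0x1 = node 0, 0x2 = node 1, and so on); each application host then connects to "its" port (e.g. `Server=myserver,1434`), pinning its connections to one node.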
On a separate note, I'd look at forced parameterization; the fact that your load is driving your CPU so high is very interesting and may be worth investigating.
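Enabling it is a one-line change. The database name below is a placeholder, and since forced parameterization affects every query in the database, test the effect on your workload first:

```sql
-- Parameterize ad hoc statements automatically so similar queries share a plan.
-- YourDatabase is a placeholder name; verify the impact before using in production.
ALTER DATABASE YourDatabase SET PARAMETERIZATION FORCED;
```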
Lastly, on multi-numa node systems, I typically have the output of the following queries dumping to a table every N seconds. Makes for some interesting analysis when workload changes or trace flags are implemented:
SELECT GETDATE() AS poll_time, node_id, node_state_desc, memory_node_id,
       online_scheduler_count, active_worker_count, avg_load_balance, idle_scheduler_count
FROM sys.dm_os_nodes WITH (NOLOCK)
WHERE node_state_desc <> N'ONLINE DAC'
and
SELECT TOP 10 GETDATE() AS sample_poll, wait_type, COUNT(*) AS task_count
FROM sys.dm_os_waiting_tasks
WHERE wait_type NOT IN
    (N'CLR_SEMAPHORE', N'LAZYWRITER_SLEEP', N'RESOURCE_QUEUE', N'SLEEP_TASK', N'SLEEP_SYSTEMTASK',
     N'SQLTRACE_BUFFER_FLUSH', N'WAITFOR', N'BROKER_TASK_STOP',
     N'BROKER_RECEIVE_WAITFOR', N'OLEDB', N'CLR_MANUAL_EVENT', N'CLR_AUTO_EVENT')
GROUP BY wait_type
ORDER BY COUNT(*) DESC
Best Answer
Community Wiki answer generated from comments on the question by Aaron.
Unless you are having performance issues, a high percentage of CXPACKET waits may only be an indicator that a large portion of the queries are going parallel, and not actually a problem. High CPU could be an indicator, but from what you have explained I would start by looking at queries with long durations but low CPU. The CXPACKET wait is sometimes associated with a query waiting for all of its threads to finish before it can merge the results (data skew).
If you are able to modify the queries and procedures, you could set a higher MAXDOP for the warehouse tasks that need it, and set the global MAXDOP lower. However, I would only do that as a last resort. You only really want to make those types of explicit hints when you've exhausted all other possibilities or you can't make changes to the code, queries or database schemas.
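A sketch of that split, with illustrative values only (the MAXDOP numbers and the `dbo.FactSales` table are placeholders, not recommendations):

```sql
-- Lower the instance-wide default degree of parallelism...
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 2;
RECONFIGURE;

-- ...and let the big warehouse queries that benefit opt back in with a hint.
SELECT CustomerID, SUM(SalesAmount) AS total_sales
FROM dbo.FactSales              -- placeholder table
GROUP BY CustomerID
OPTION (MAXDOP 8);
```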