Sql-server – High CXPACKET and LATCH_EX waits

parallelismperformancequery-performancesql serverwait-types

I am having some performance issues with a data processing system which I am working on. I have gathered wait stats from a one hour peroid which show a large amount of CXPACKET and LATCH_EX wait events.

The system consists of 3 processing SQL Servers which do alot of number crunching and calculations and then feed the data into a central cluster server. The processing servers can have up to 6 jobs running each at any one time. These wait stats are for the central cluster which I think is causing a bottlneck. The central cluster server has 16 cores and 64GB RAM. MAXDOP is set to 0.

I guess the CXPACKET is from the multiple parallel queries running however I am not sure what the LATCH_EX wait event is indicating. From what I have read this could be a non-buffer wait?

Can anyone suggest what the cause of these kind of waits stats would be and what course of action I should be taking to investigate the root cause of this performance issue?

The top query results are the total wait stats and the bottom query result is the stats over the 1 hour period
SQL Wait sample

Best Answer

CXPACKET can be accompanied with a LATCH_XX (possibly with PAGEIOLATCH_XX or SOS_SCHEDULER_YIELD as well). If this is the case (and I believe it is, based on the question) then the MAXDOP value should be lowered to fit your hardware.

Besides this, here are some more recommended steps in diagnosing the cause of high CXPACKET wait stats values (before changing something on SQL Server):

  • Do not set MAXDOP to 1, as this is never the solution

  • Investigate the query and CXPACKET history to understand and determine whether it is something that occurred just once or twice, as it could be just the exception in the system that is normally working correctly

  • Check the indexes and statistics on tables used by the query and make sure they are up to date

  • Check the Cost Threshold for Parallelism (CTFP) and make sure that the value used is appropriate for your system

  • Check whether the CXPACKET is accompanied with a LCK_M_XX (usually accompanied with IO_COMPLETION and ASYNC_IO_COMPLETION). If this is the case, then parallelism is not the bottleneck. Troubleshoot those wait stats to find the root cause of the problem and solution

If you really need to understand the CXPACKET wait type in depth, I would advise reading the Troubleshooting the CXPACKET wait type in SQL Server article