SQL Server – Dealing with CXPACKET Waits by Setting Cost Threshold for Parallelism

parallelismperformanceperformance-tuningquery-performancesql-server-2008

As a follow-up to my previous question on perf troubleshooting a Sharepoint site, I was wondering if I could do something about the CXPACKET waits.

I know the knee-jerk solution is to turn off all parallelism by setting MAXDOP to 1 – sounds like a bad idea. But another idea is to increase the cost threshold before parallelism kicks in. The default of 5 for the cost of an execution plan is fairly low.

So I was wondering if there's a query out there already written that would find me the queries with the highest execution plan cost (I know you can find those with the highest duration of execution and so on – but is the execution plan cost retrievable somewhere, too?) and that would also tell me if such a query has been executed in parallel.

Does anyone have such a script at hand, or can point me in the direction of the relevant DMV, DMF or other system catalog views to find this out?

Best Answer

CXPACKET is never a cause; it gets all the blame, but it's always a symptom of something else. You need to catch these queries in the act and figure out what "something else" is. It might be different from query to query, and turning off parallelism altogether is - as you've suggested - unnecessary overkill in most cases. But it is often the least amount of work, which is why it is such a prevalent "fix."

If you can get an actual plan for a query that seems to be responsible for high CXPACKET waits, load it into SentryOne Plan Explorer. There's usually a reason behind this; we show which parallel operations led to thread skew, and you can easily correlate that to estimates that are off (we highlight operations with estimates that are off by a at least certain threshold). Usually the underlying problem is really bad/out-of-date (or unavailable) statistics.

Unfortunately what you'll find in sys.dm_exec_cached_plans are estimated plans. They won't tell you whether the plan went parallel when it was actually used, because the actual plan is not what's cached. In some cases you expect to see both a serial and parallel plan for the same query; this is not how SQL Server deals with the situation for parallel plans that might be parallel at runtime. (Lots of information about that here.)