Sql-server – SQL Server experiencing THREADPOOL waits while still having plenty of available workers

sql serversql-server-2016wait-types

I recently took a look at my production SQL Server's cumulative threadpool waits, and it's literally days worth of waits. So I started doing some digging.

I'm on a 32 core, 64 bit server (running on EC2). Max workers is correctly set to 0 in config, and the following DMV reports the correct, expected number:

SELECT max_workers_count FROM sys.dm_os_sys_info

Output: 960

Looking at my total number of workers, I'm rarely seeing any over 300-400:

SELECT count(*) FROM sys.dm_os_workers

Output: 372

And yet, if I run the following DMV, I always have threads listed:

SELECT * FROM sys.dm_os_waiting_tasks
WHERE wait_type = 'THREADPOOL'

Waiting Tasks -- Threadpool

SELECT dow.state , dow.is_preemptive , dow.is_sick , dow.is_in_polling_io_completion_routine , [Num Workers] = COUNT(1) 
FROM sys.dm_os_workers dow 
GROUP BY dow.state , dow.is_preemptive , dow.is_sick , dow.is_in_polling_io_completion_routine;

enter image description here

The waits in milliseconds are usually pretty short, 20-200 ms generally, and they disappear quickly as well, but they are adding to the overall cumulative threadpool figure. They also never have blocking sessions.

I am stumped as to why anything is encountering a threadpool wait when I have so many available workers. Shouldn't the hundreds of available workers be handling these requests instantly without any threads going to threadpool?

I would appreciate any input or direction here.

SQL Server Version: SQL Server 2016 (SP2-CU8) (KB4505830) – 13.0.5426.0 (X64) Enterprise Edition: (64-bit) on Windows Server 2016 Datacenter 10.0 (Build 14393: ) (Hypervisor)

Cost Threshold for Parallelism is set to 1400.
Hyperthreading is not enabled.
Maximum Degree of Parallelism is set to 8.

Best Answer

If you have the max degree of parallelism set to zero, which is the default, then queries that might benefit from parallelism will consume 32 threads. With the number of workers set to 960, again the default for your setup, you could only run 30 simultaneous parallel queries.

I'd recommend setting max degree of parallelism to something sane, like 8.