Sql-server – Processor queue length but not very high CPU usage

performancesql serversql-server-2016

I've a virtual server running SQL Server 2016 (4 vcores) where most of the time there is a processor queue length of 4 (sometimes it goes up to 15) but the CPU is using 25% on average. There are about 3000 batches/sec.

Using a query found in an article of Glenn Berry found on SQLSkills.com I found out that there is an avg_task_count of 15 and a avg_runnable_task_count of 2 (but not constantly) :

SELECT AVG(current_tasks_count) AS [Avg Task Count], 
AVG(work_queue_count) AS [Avg Work Queue Count],
AVG(runnable_tasks_count) AS [Avg Runnable Task Count],
AVG(pending_disk_io_count) AS [Avg Pending DiskIO Count]
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE scheduler_id < 255 OPTION (RECOMPILE);

Is it correct to say that the processor queue length and the avg_task_count are an indication there are a lot of requests and the server has problems to process them? But what I don't understand is why isn't the cpu usage higher?

The physical host has a 25% CPU usage, there is no memory reservation but there is no memory over-commitment on the host.

Best Answer

While the recommendations below may not specifically reduce the processor queue length below 4 on a 4-CPU VM, they will help with overall performance, which is presumably why you're asking your question.

Best practices published by VMware and Microsoft recommend explicitly that you should set both a CPU reservation, and a memory reservation for any VM running SQL Server.

See Architecting Microsoft SQL Server on vSphere - Best Practices Guide by VMware for all the details about how to make SQL Server run reliably with the best performance possible.

I know you're not using Microsoft's Hyper-V product, but for future users who might see this post, see Running SQL Server with Hyper-V Dynamic Memory for their recommendations.

Specific recommendations the get the best performance out of SQL Server include:

  1. Configure "max server memory" to a setting that makes sense for your environment. This post may help you determine what that should be set to.

  2. Ensure you set "min server memory" to a value that makes sense. For production SQL Servers, it is "typical" to set this either to the same value as "max server memory", or to a value 1 or 2 GB lower than max server memory, in case the O/S experiences extreme memory pressure.

  3. Configure the service account used by SQL Server with the "Lock Pages in Memory" right. See this post for details.

  4. Ensure you have at least 2 cores configured for the VM. You probably want many more than 2, but setting up the VM with a single core is a recipe for disaster. For production SQL Server installs, I'd recommend at least 8 cores.

  5. In Hyper-V, configure the VM to be NUMA aware, and disable "Dynamic Memory". Many server folks will balk at this suggestion since it reduces the number of VMs that can be ran on any single host server; however it is critical for good performance for any service that consistently needs a lot of memory, such as a database management system. In VMWare, configure the VM with a memory reservation that matches the total amount of memory assigned to the VM. This prevents the VMware balloon driver from consuming memory in the VM, which would in turn reduce memory available to SQL Server, and thereby reduce performance.

  6. Ensure you use configure the VM with a high-queue SCSI interface and network interface - read the best practices docs linked above for specific recommendations.

As @SqlWorldWide mentioned in a comment on your question, you should look at this. Taken from that newsgroup post:

First of all anything below 10 is normal for the processor queue length counter in the perfmon (according to the perfmon's documentation)

Secondly if you have multiple processors you have to divide the queue length by the number of the processors. Because there is only one scheduler in the SMP system too.

And lastly the queue length is an instanteneous counter. It shows the number of ready threads right now, which is a sort of misnomer because IMHO it takes into consideration only the highest priority queue (note that there is queue per priority). Those threads will eventually block on something like I/O or synchronization mechanism.

Now the CPU usage counter is different: It is calculated by measuring the duration of the idle thread is active in the sample interval, and subtracting that time from interval duration Idle thread will run only when there is no ready thread with higher priority. And this is essentially the basic assumption of any preemtive system. I can have single thread (in addition to the Idle one) but if it runs code like this:

while ( true );

The CPU usage counter will show 100%. Note that in this hypothetical case the queue length is 1 :o)