Sql-server – SQL Server 2016 – unexpected privileged time bursts

performancesql server

MS SQL 2016 SP2 CU8 + Always On.

1 CLR module (occasional usage, no correlation with the issue)

I'm looking for methods to find the reason for such behavior. DBCC FREEPROCCACHE solves. But after approx. 10 such "solving" attempts I have to initiate failover. If not I may get an outage with stopping of query processing at all.

I tried to remove from the cache not all plans but some. Still have no stable progress here.

So, does anybody know a way to analyze privileged time consumption? And – yes – privileged time is consumed by the sqlserver service, confirmed.

The current workaround is running FREEPROCCACHE in case privileged time is greater when 7% at least for the 10 seconds and weekly manual failover. But this approach is not good enough for me.

In response to some requests for additional information:

  • The privileged time is not correlated with IO. Actually, I tried to find any correlation including perf counters, waits and spinlocks but found nothing.
  • That server has columnstore only in the "tech" DB with occasional usage patterns. I.e. we can count "no columnstores in the active usage".
  • The server is a physical machine, not a VM. >
  • "What are you actually trying to solve here?" When the problem occurs, I may get an outage with stopping of query processing at all.
    • For "end users" this looks like server stopped to process their queries.
  • Disk drivers are up to date
  • Disk and Network IO are OK and I see no correlation with the issue

Here are assorted perfmon counters at the beginning of a typical reproduction of this problem:

Typical reproduction #1 - at the beginning

And the same counters at the end of the repro:

Typical reproduction #2 - at the end

Best Answer

From an article on the internet:

(emphasis mine)

The Processor(_total)\ % Privileged Time counter shows the percent of time that the processor is spent executing in Kernel (or Privileged) mode. Privileged mode includes services interrupts inside Interrupt Service Routines (ISRs), executing Deferred Procedure Calls (DPCs), Device Driver calls and other kernel-mode functions of the Windows® Operating System.

Most of the time a processor should be executing User mode operations, a high % privileged time might indicate a poorly written device driver or a faulty piece of hardware.

Reference: Windows Performance Counters Explained (AppAdminTools)

You might want to have a look at the DLL executing in the context of the CLR. It may not be releasing all its resources.

Equally interesting the information from Microsoft | SQL Docs:

(emphasis mine)

Processor: % Privileged Time

Corresponds to the percentage of time the processor spends on execution of Microsoft Windows kernel commands, such as processing of SQL Server I//O requests. If this counter is consistently high when the Physical Disk counters are high, consider installing a faster or more efficient disk subsystem.

Note

Different disk controllers and drivers use different amounts of kernel processing time. Efficient controllers and drivers use less privileged time, leaving more processing time available for user applications, increasing overall throughput.

Reference: [Monitor CPU Usage][] (Microsoft | SQL Docs)

  1. Ensure your disk drivers are up to date
  2. Ensure your disk I/O is ok
  3. Remove any other services/applications from the SQL Server (instances)