SQL Server – Update stats with full scan on SQL Server 2014 uses 100% CPU; on 2008 R2, 15%

sql-server | sql-server-2014 | sql-server-2008-r2

Why does a full-scan statistics update use 100% of CPU on SQL Server 2014 when it uses maybe 20% of CPU on SQL Server 2008 R2, for the same tables, on similar hardware?

I have been looking at MAXDOP and other options, and really see nothing that stands out. I realize there could be settings that cause this, but the settings are very similar on both instances (for example, MAXDOP is 4 on both, and both have multiple cores). Both are Enterprise Edition.

Is there something "different" in SQL Server 2014 versus SQL Server 2008 R2 that could explain this? I have the memory option at 90% for both servers. Any thoughts on what to look for?
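A quick way to compare the two parallelism-related settings across the instances is to query sys.configurations on each server; a minimal sketch (nothing in it is specific to my setup):

    -- Compare the two instance-level knobs that control parallel plans.
    -- Run on both servers and compare the results side by side.
    SELECT name, value_in_use
    FROM sys.configurations
    WHERE name IN (N'max degree of parallelism',
                   N'cost threshold for parallelism');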

I run update statistics with full (100%) scan once a week on two servers using SQL Server 2008 R2/SP3 and SQL Server 2014/SP2, and the databases have the same structure. On the 2008 R2 server the update stats of two very large tables takes several hours, which is what I expect, but the CPU stays under 20% or so utilization the entire time. On the 2014 server, though, the CPU goes to 100% for about 40 minutes. The tables are a bit smaller on the 2014 server. I can see this in SQL Monitor's analysis screens.

Here is the output of the Ola Hallengren maintenance log on the 2014 SQL Server; CPU goes to 100% from about 2:10 to 2:45:

Date and time: 2017-06-24 02:10:20  
Command: UPDATE STATISTICS [InVA].[dbo].[AuditField] [_WA_Sys_00000005_15502E78] WITH FULLSCAN  
Outcome: Succeeded  
Duration: 00:07:48  
Date and time: 2017-06-24 02:18:08  
Date and time: 2017-06-24 02:18:08  
Command: UPDATE STATISTICS [InVA].[dbo].[AuditField] [_WA_Sys_00000006_15502E78] WITH FULLSCAN  
Outcome: Succeeded  
Duration: 00:32:22  
Date and time: 2017-06-24 02:50:30  

Here is the output of the Ola Hallengren maintenance log on the 2008 R2 SQL Server for the same two stats; here the CPU goes to maybe 15%:

Date and time: 2017-06-24 03:30:32  
Command: UPDATE STATISTICS [InGA].[dbo].[AuditField] [_WA_Sys_00000003_0425A276] WITH FULLSCAN  
Outcome: Succeeded  
Duration: 00:05:00  
Date and time: 2017-06-24 03:35:32  
Date and time: 2017-06-24 03:35:32  
Command: UPDATE STATISTICS [InGA].[dbo].[AuditField] [_WA_Sys_00000004_0425A276] WITH FULLSCAN  
Outcome: Succeeded  
Duration: 00:52:31  
Date and time: 2017-06-24 04:28:03

I cannot run them with the server MAXDOP set to 1, since that eliminates parallel plan generation for everything and could hurt the application. I plan to go the opposite direction and increase it to 8 (there are 16 cores on the box) and see what happens; it may finish faster, reducing the length of time the CPU is pegged. This job runs while users are mostly gone.
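For reference, the change I plan to try is just the standard sp_configure call (8 is my guess at a reasonable value for a 16-core box, not a tested recommendation):

    -- Raise instance-level MAXDOP from 4 to 8.
    -- It is an advanced option, so it has to be made visible first.
    EXEC sp_configure N'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure N'max degree of parallelism', 8;
    RECONFIGURE;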

Best Answer

Statistics updates can go parallel based on a lot of different options in SQL Server:

  • Cost Threshold for Parallelism - a query's estimated cost must be at least this high to ride the parallelism train. Your two servers could have different CTFP settings that cause the 2008 R2 update to go single-threaded while the 2014 one goes multi-threaded.
  • Max Degree of Parallelism - dictates how many cores a query can use, at most, if SQL Server decides to parallelize it that far. The 2008 R2 box might have MAXDOP set to 1, whereas the 2014 box might have it set to the default of 0 (unlimited).
  • Resource Governor - this Enterprise Edition feature lets you throttle different groups of users or applications to different MAXDOPs (a minimal sketch follows this list).
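Here's that Resource Governor sketch, capping MAXDOP for a maintenance login. The login, group, and function names are made up for illustration, and the classifier function has to live in master:

    -- Workload group that caps parallelism for maintenance work.
    CREATE WORKLOAD GROUP MaintenanceGroup WITH (MAX_DOP = 4);
    GO
    -- Classifier function (must be created in master, schema-bound).
    CREATE FUNCTION dbo.rg_classifier()
    RETURNS sysname
    WITH SCHEMABINDING
    AS
    BEGIN
        -- Hypothetical login used by the stats-update job.
        IF SUSER_SNAME() = N'StatsMaintenance'
            RETURN N'MaintenanceGroup';
        RETURN N'default';
    END;
    GO
    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rg_classifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;

The upside of this approach is that the application's own queries keep their normal MAXDOP; only the sessions the classifier routes into the capped group are throttled.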

In later versions of SQL Server (2016 & newer), this gets even more complicated:

  • Database scoped configurations - you can right-click a database, go into its properties, and set a MAXDOP default for just that database (or use ALTER DATABASE SCOPED CONFIGURATION).
  • Statistics parallelism hints - starting in 2016 SP2, the statistics creation & update statements accept MAXDOP hints (see the sketch after this list).
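A sketch of both, reusing the AuditField table from the question; the MAXDOP value of 4 is just an example, and neither option exists on the asker's 2014 instance:

    -- 2016 SP2 and later: per-statement MAXDOP hint on a stats update.
    UPDATE STATISTICS dbo.AuditField WITH FULLSCAN, MAXDOP = 4;

    -- 2016 and later: database-level MAXDOP default.
    ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;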

As you noted, your 2008 R2 one is going single-threaded, whereas the 2014 one is going multi-threaded (thus finishing faster, but maxing out the CPU while it runs).
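If you want to verify that directly while the job runs, counting the tasks attached to the running request is one way; a minimal sketch (the filter on the command column is an assumption about how the stats update shows up in sys.dm_exec_requests):

    -- One row per running request; task_count > 1 means it went parallel.
    SELECT r.session_id,
           r.command,
           COUNT(t.task_address) AS task_count
    FROM sys.dm_exec_requests AS r
    JOIN sys.dm_os_tasks AS t
        ON t.session_id = r.session_id
    WHERE r.command LIKE N'%STATISTICS%'
    GROUP BY r.session_id, r.command;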

To find the right balance for your stats jobs, think about:

  • What other workloads are happening in the database at the same time? Can you afford to dominate the box during brief periods? For example, in data warehouses that sit idle during most weekend hours, I've seen folks crunch hard on stats updates with fullscans when they know nobody's using the server anyway. In heavy-duty transactional environments, you have to dial back the impact of maintenance tasks if users complain even during the midnight hours.
  • Is fullscan really necessary? Are you seeing queries that only get good plans when you use the fullscan option, or are you merely doing it as a best-practices thing? As your database grows, if your hardware investments don't keep pace, you may have to start making tradeoffs in stats sampling rather than doing fullscans.
  • Can you update stats less often? For example, update a quarter of your statistics each weekend, so that everything gets updated once a month.
  • Can you update fewer objects? Often I see folks updating stats even on huge audit or archive tables simply because a few dozen new rows were inserted, but those inserts don't really change the statistics on the table (and nobody's querying it anyway). A sketch for spotting such low-churn stats follows this list.
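Here's that sketch: sys.dm_db_stats_properties exposes a per-statistic modification counter (available from 2008 R2 SP2 and 2012 SP1 onward), so you can find stats that have barely changed since their last update; the 1% threshold is arbitrary:

    -- Stats where fewer than 1% of rows changed since the last update;
    -- candidates for skipping in a weekly FULLSCAN job.
    SELECT OBJECT_NAME(s.object_id)  AS table_name,
           s.name                    AS stats_name,
           sp.last_updated,
           sp.rows,
           sp.modification_counter
    FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE OBJECTPROPERTY(s.object_id, N'IsUserTable') = 1
      AND sp.modification_counter < 0.01 * sp.rows
    ORDER BY sp.last_updated;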