SQL Server – Update Stats Takes Longer with 50% Sample Rate vs Full Scan

sql serversql-server-2012statistics

We have a home grown update stats job for large tables that essentially issues an UPDATE STATS command. Historically we had been defaulting to FULL SCAN, but recently we switched to SAMPLE 50 PERCENT. Oddly, the update stats command run times are drastically higher.

To give you an example, we have table 1 which has 6 statistics (3 index, 3 auto generated). Clustered index is 1.2 TB; NCI 1 is 2.7 GB; NCI 2 is 2.6 GB

The statistics were updated on the table 1 month ago with FULL SCAN and the command took 96 minutes. The statistics were updated last night with SAMPLE 50 PERCENT and the command took 593 minutes! Table row counts between the 2 runs were roughly the same.

I can see from sys.dm_db_stats_properties that the clustered index took all but 4 minutes of that time. My question is, why would reducing the sample rate by 50% caused the command to run almost 5 times longer?

There wasn't any blocking occuring while the command ran (according to SQL Sentry) and there weren't any resource bottlenecks that jump at me (CPU < 40%, IO latency < 10 by-in-large).

One thing I'm wondering about is if parallelism is in play – with a full scan, can SQL use parallelism, but with a sample % it is single threaded?

We're running SQL 2012 SP2 CU7

Best Answer

When you run stat updates WITH FULLSCAN, it goes parallel. I believe it has since SQL 2005.

From TechNet:

"Parallel statistics gathering for fullscan: For statistics gathered with fullscan, creation of a single statistics object can be done in parallel for both non-partitioned and partitioned tables."

Source: https://technet.microsoft.com/en-us/library/cc966419.aspx

When you don't use FULLSCAN the engine leverages the TABLESAMPLE option which is a parallelism inhibitor. In SQL 2016, sampled stat updates can also go parallel.

Source: Parallel Statistics Update