Sql-server – Increased waits during checkpoint after upgrading to better storage

checkpointsql serversql-server-2012storagewaits

When we migrated from an older all flash array, to a newer all flash array (different, but well established vendor), we started seeing increased waits in SQL Sentry during checkpoints.

Version: SQL Server 2012 Sp4

On our old storage our waits were around 2k with "spikes" to 2500 during a checkpoint, with the new storage the spikes are typically 10k with peaks near 50k. Sentry points us more toward PAGEIOLATCH watis. Doing our own analysis, it seems to be a combination of PAGEIOLATCH and PAGELATCH waits. Using Perfmon, we can generally say the more pages we checkpoint, the more waits we get, but we're only flushing ~125 mb during the checkpoint. Our workload is mostly writes (inserts/updates primarily).

The storage vendor has proven to us that the Fibre Channel direct attached array is responding sub 1 ms during these checkpoint events. The HBA also confirms the array's numbers. We also do not believe it to be an HBA queuing issue as the queue depth was never above 8. We have also tried a newer HBA, changing the ZIO, execution throttle, and queue depth settings to no avail. We've also increased the server's memory from 500 GB to 1 TB with no change. During the checkpoint process we do see 2 – 4 individual cores (of 16) spike to 100%, but overall CPU is around 20%. The BIOS is set to high performance as well. Interestingly though, we do see the CPUs are generally in a C2 sleep state even though we've disabled that, so we're still researching why the sleep state goes past C1.

We can see that almost all the waits are on data pages with an occasional PFS of DCM page type. Waits are in user DBs, not tempdb. We also see that the waits are over several data pages, with some SPIDs waiting on the same page. The database design does have a couple of insert hot spots, but the same design was in place with the old storage.

Running a loop of this query 100 times, we were able to catch how many SPIDs were waiting on disk vs memory

SELECT
    [owt].[wait_type], count(*) as waitcount
FROM sys.dm_os_waiting_tasks [owt]
WHERE [owt].[wait_type] LIKE 'PAGE%'
group by [owt].[wait_type]
order by 1
GO 100

enter image description here

The "nice" thing is we can easily reproduce the issue in our perf environment which has the same model array and similar server specs. I'd appreciate any thoughts on where else to look or how to narrow down the problem. Right now our next tests include: a new server with newer motherboard and more CPUs; disabling SIOS datakeeper (even though this has been in place with old storage); different HBA brand.

exec sp_Blitz @outputtype = 'markdown'

Priority 5: Reliability:
– Dangerous Third Party Modules – Sophos Limited – Sophos Buffer Overrun Protection – SOPHOS~2.DLL – suspected dangerous third party module is installed.

Priority 200: Informational:
– Cluster Node – This is a node in a cluster.
– TraceFlag On
– Trace flag 1117 is enabled globally.
– Trace flag 1118 is enabled globally.
– Trace flag 3226 is enabled globally.

Priority 200: Licensing:
– Enterprise Edition Features In Use
* xxxxx – The [xxxxxx] database is using Compression. If this database is restored onto a Standard Edition server, the restore will fail on versions prior to 2016 SP1.
* xxxxx – The [xxxxxx] database is using Partitioning. If this database is restored onto a Standard Edition server, the restore will fail on versions prior to 2016 SP1.

Priority 240: Wait Stats:
– No Significant Waits Detected – This server might be just sitting around idle, or someone may have cleared wait stats recently.

Priority 250: Server Info:
– Hardware – Logical processors: 16. Physical memory: 512GB.
– Hardware – NUMA Config
– Node: 0 State: ONLINE Online schedulers: 8 Offline schedulers: 0 Processor Group: 0 Memory node: 0 Memory VAS Reserved GB: 1177
– Node: 1 State: ONLINE Online schedulers: 8 Offline schedulers: 0 Processor Group: 0 Memory node: 1 Memory VAS Reserved GB: 0
– Power Plan – Your server has 3.50GHz CPUs, and is in high performance power mode
– Server Last Restart – Jul 4 2018 4:56AM
– SQL Server Last Restart – Jul 5 2018 5:11AM
– SQL Server Service – Version: 11.0.7462.6. Patch Level: SP4. Edition: Enterprise Edition (64-bit). Availability Groups Enabled: 1. Availability Groups Manager Status: 1
– Virtual Server – Type: (HYPERVISOR)
– Windows Version – You're running a pretty modern version of Windows: Server 2012R2 era, version 6.3

Priority 200: Non-Default Server Config:
– Agent XPs – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.
– backup compression default – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.
– blocked process threshold (s) – This sp_configure option has been changed. Its default value is 0 and it has been set to 20.
– cost threshold for parallelism – This sp_configure option has been changed. Its default value is 5 and it has been set to 30.
– Database Mail XPs – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.
– max degree of parallelism – This sp_configure option has been changed. Its default value is 0 and it has been set to 8.
– max server memory (MB) – This sp_configure option has been changed. Its default value is 2147483647 and it has been set to 496640.
– min server memory (MB) – This sp_configure option has been changed. Its default value is 0 and it has been set to 8196.
– optimize for ad hoc workloads – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.
– remote access – This sp_configure option has been changed. Its default value is 1 and it has been set to 0.
– remote admin connections – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.
– scan for startup procs – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.
– show advanced options – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.
– xp_cmdshell – This sp_configure option has been changed. Its default value is 0 and it has been set to 1.

Best Answer

Hmm. You show spids waiting during checkpoint, but not how long waiting on average/in aggregate (which, honestly, would be all I care about). Do a differential wait stats analysis to see if duration is of concern. Also, what exactly are the two waits in your chart? If you are getting lots of memory grant waits with 1TB of RAM in play we need to have a different discussion. :-D

The 125MB write speed during checkpoint: is that JUST checkpoint writes or ALL? Either way it seems low for all-flash storage. Did you benchmark said storage for various write patterns and if so what numnbers did you get?