Sql-server – SQL Server options that influence log buffer flushing or commits-per-second performance

sql serversql server 2014

I am investigating two SQL Server 2014 instances, both running identical software, both running identical hardware (VMs), both loaded with the same test database, but one with hugely greater performance on a microbenchmark.

The benchmark essentially runs the following in a loop:

BEGIN TRANSACTION
    INSERT INTO [tablename] ([data])
    VALUES (CONVERT(NVARCHAR(2000), NEWID()))
COMMIT

One server can achieve approximately 200 transactions per second. The other, approximately 2000 transactions per second.

Since this transaction is doing so little, I hypothesized that the bottleneck could be disk I/O in flushing the log buffer. I thought the DELAYED_DURABILITY option was sure to explain the difference. But alas, it was set to Disabled on both servers. An audit of other options has turned up no other apparent configuration differences.

So my questions are:

  1. Is there any other way that transaction durability could be delayed when that option is set to Disabled, e.g. a master override config file somewhere?
  2. Is there any other way that log buffer flushing could be influenced?
  3. Are there any other settings I should look into that might explain the wide variance in performance that I am seeing?

N.B. I'm aware that this is a microbenchmark and is not representative of the application workflow – however, real-world testing has shown a performance difference too, and this transaction loop seems to be the minimum reproducible example that shows a difference.

Additional details

Looking at the wait stats, WRITELOG is top of the list.

Best Answer

both running identical hardware (VMs)

If the performance of this minimal test is that drastically different, and you have thoroughly confirmed that the software (and instance / database level settings - for example the checkpoint-related settings like indirect checkpoint, recovery interval, target recovery time, etc might affect a benchmark like this one) is the same, then the statement about the hardware being the same is probably not exactly right.

You mentioned WRITELOG is your highest wait. With this workload, WRITELOG will almost inevitably be the highest wait, since that's all you're doing. Is the difference in WRITELOG waits between the two servers large enough to account for the 10x difference in transactions per second?

If that wait is significantly higher on the slow server, then it's possible that, despite the same hardware specs being specified at the VM level, the "path" to the I/O system is slower on the second VM. Or the underlying storage is slower.

  • Check out the sys.dm_io_virtual_file_stats DMV to see if the slow server is experiencing slower writes or more frequent / longer stalls than the fast server.
  • Check the SQL Server error log to see if there are any "SQL Server has encountered X occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [M:\SomeDataFile.mdf] in database [SomeDatabase] (5)." error messages.

If you're seeing slowness reported in those places, you'll need to work with the VM administrator - to see if there is a difference in the underlying storage, or the access to the storage (if it's going out to a SAN).