Sql-server – Does SQL Server 2016 eliminate poor primary performance when reports take locks on seconday

availability-groupssql serversql-server-2016

In the "Effectively Troubleshooting Latency and Failover of Always On" by Sourabh Agarwal At PASS on Friday, November 3, 2017 3:30" about 30 minutes in he talks about how Synchronous mode commits are quicker, as the hardening is just about getting the t-logs to disk on the secondary. The Redo is an async operation outside of the AlwaysOn commit process.

I may be missing something, but it seems like this "could" mean that the primary OLTP will not be impacted by long running (and blocking) queries on the secondary. I envision that the logs just build in size on the secondary until the blocking report query completes, and then applies all the changes, that have occured on the primary and been hardened to disk on the secondary.

Related Will running a large query on a secondary database in an availability group affect transaction performance in the primary database? In this question the answwer by Kin suggests the answer does not address 2016+

Does SQL Server 2016 eliminate poor primary performance when reports take locks on seconday?

For this question: Assume multiple disks, with data (MDF) and logs (LDF) on seperate drives.

Best Answer

I may be missing something, but it seems like this "could" mean that the primary OLTP will not be impacted by long running (and blocking) queries on the secondary.

There isn't a one sized fits all answer, so assuming that the reporting workload doesn't run the secondary replica out of CPU/Memory/Disk/Network resources and there isn't any contention for said resources then the primary should not be impacted but log growth will occur, which in the end could impact the performance of the AG/SQL.

I envision that the logs just build in size on the secondary until the blocking report query completes, and then applies all the changes, that have occured on the primary and been hardened to disk on the secondary.

That's the gist of it, if you can't truncate and are growing log that may eat up CPU/Disk time that is wasted or cause an issue with the disk growing too much and eating all available space. Don't laugh (or do!), I have witnessed it multiple times - it really does happen.

The other item which isn't discussed is synchronization and redo. Assuming synchronization stays nominal (synchronized) the redo may fall behind. Yes, at some point it might "catch up" but the SLA was already missed. REDO queue build up is a thing in extremely high performant environments and even though SQL Server 2016+ has the potential for parallel redo it doesn't mean it'll always be faster or that the database is using parallel redo. I'm assuming we don't care about automatic failover or failover times (database startup times will be affected).

Does SQL Server 2016 eliminate poor primary performance when reports take locks on seconday?

ASSUMING all of the previous, it acts no differently than 2012/2014 in terms of readable secondary replicas. Yes, there were enhancements but the real reason why there are performance implications on the primary is mostly due to a resource being run out on a secondary and causing long waits which show up in the primary as HADR_SYNC_COMMIT.