Sql-server – Availability Groups Failover Readinness “No data loss” and rollback of nonqualified transactions

availability-groupsfailoverhigh-availabilitysql serversql-server-2016

I have Availability Groups with Synchronous Replication and Automatic Failover setup with SQL Server 2016 SP2 in my production environment.

I want to do a manual failover for patching and I have done this in my dev environment several times. Right now my AG Dashboard shows, Failover Readiness "No Data Loss".

However whenever I do a failover, I still get notifications (a manually setup alert) from SQL Server telling me there were nonqualified transactions rolled back on the server, just like I did when I had Mirroring.

So, is this truly no data loss? It doesn't sound like it. And what exactly are nonqualified transactions vs. just transactions?

Best Answer

This doesn't count as data loss in this context.

From the documentation (emphasis mine):

A manual failover causes a synchronized secondary replica to transition to the primary role after a database administrator issues a manual-failover command on the server instance that hosts the target secondary replica. To support manual failover, the secondary replica and the current primary replica must both be configured for synchronous-commit mode, if any. Every secondary database on the availability replica must be joined to the availability group and synchronized with its corresponding primary database (that is, the secondary replica must be synchronized). This guarantees that every transaction that was committed on a former primary database has also been committed on the new primary database. Therefore, the new primary databases are identical to the old primary databases.

Any transaction in progress while the failover occurs is going to be rolled back and needs to be handled with re-try logic on the application side. You can't have a transaction start on one node, trigger a failover, and then have it finish running on a secondary node - though that would be pretty cool, admittedly.

The synchronous replication just guarantees both nodes are in the same state before/after the failover, otherwise the former primary node could have writes that the now-primary node does not, causing inconsistent data betwixt the two.

Check out the very good documentation for more in-depth information about what really goes on, and what you're actually getting out of the AG feature set.