Sql-server – AlwaysOn Geo Dispersed Automatic Failover

availability-groupssql-server-2017

I'm looking to implement a "not quite" hub and spoke type architecture using AlwaysOn and wondered how the cluster quorum might play into this scenario if a remote site can't connect to the hub where we setup a file share witness or the other nodes in the underlying cluster. If the remote location was the primary for the AG and had a local secondary and also a remote secondary (Corp), would the AG automatically failover to the remote secondary even though there may not be an issue inside the remote location?

For example: The hub we will call corporate and the remote locations we will call stores, all on the same AD domain. I put all twelve stores (two servers each, a primary and a local secondary) and the corporate server (will be a central secondary to each of the remote primary servers) into the cluster together (not using Failover Clustering, just the backbone of AlwaysOn) and setup a file share witness. Now let's say that store 1 looses WAN connectivity back to corporate for a few hours. Since neither of the servers (primary or local secondary) can connect to any other store (node) and the file share witness can't be reached, will the Availability Group automatically failover if I have it set for Automatic Failover and Database Level Health Detection? Or will it understand that it has gotten separated from the cluster and simply do nothing?

I realize this is a bit of a complex use case when it comes to AlwaysOn, but thought I'd see if anyone had knowledge of how this scenario might work.

Best Answer

There is a max of 8 secondary replicas for AlwaysOn in SQL 2016, I do not believe there is a change for that in 2017

Always On Availability Groups on 2016 supports up to eight secondary replicas including one primary replica and two synchronous-commit secondary replicas (no Enhancement on this compared to SQL 2014). Also use SQL Server 2016 Always On Availability Groups to provide high availability for SQL Server databases hosted in Azure, etc. Source

In any case adding a local secondary would not add value from a SQL failover scenario. Only the instance with votes for quorum can be primary, of these there can only be three potential primaries

Another enhancement is an increase in the number of Automatic Failover partners which means you can have two more failover partners apart from the primary. In a three node scenario, if one node fails still we can have high availability with other two nodes. Source

Automatic failover requires synchronous-commit, Spreading synchronous-commit over geographic distance can have an impact on performance.

If you have a 3 node cluster (file share and other quorum considerations excepted ...), any node that can not talk to the at least one of the other two will be down. If all three loose connection, there will be no auto failover (all three will be down). If any two can talk, one of those two will be primary.

Any of the replicas that can communicate with the current primary can have asynchronous copies that are current. So if corporate goes down, one of your two stores will be primary, and you could have several stores with asynchronous read only copies.

Given your scenario you probably want the primary at corporate, and your two synchronous-commit automatic failover partners at the stores that provide the fastest and most dependable connections to corporate.