Sql-server – How to do failover between AG setup for FCI-FCI servers

availability-groupsclusteringsql serversql-server-2017

Hope I can portrait my question well because i really get confused with AG's 🙂

We have the setup like below-

Windows cluster WKCLU01 with 4 nodes (Node1,2,3,4) and a file share witness in DC3, total 5 votes so ODD# fits in

Out of those 4 node

In DC1 –> SQL FCI Shared storage between Node1 and Node 2 as SQL1\inst1

In DC2 –> SQL FCI Shared storage between Node3 and Node 4 as SQL2\inst2

Now we have to setup AG between DC1 and DC2. below is my understanding:

AG can be setup between 2 replicas here SQL1\inst1 and SQL2\inst2 in ASYNC mode as per limitation and cant use SYNC mode. Assuming this is correct

  1. Is it true automatic failover will happen between each of the FCIs only just like plain old Always on FCI's?

  2. Now how can we do a planned failover between DC1 and DC2 as per our bi-monthly activity? Do we have TSQL or PS to help us automate this. On some msdn links i am confused doing so will cause data loss and somewhere it wont if we change sync mode?

Please suggest

Best Answer

When you said it needs to be set as "async" as per limitation, are you talking about SQL limitation or limitation on your side because of, for example, network latency ?

I'm pretty sure you can build you AG in Sync mode if you want so, I will presume it's limitation on your side.

To answer your questions:

If the primary node for SQL1\inst1 goes down, the cluster will failover to the other node. (just like regular FCI) - Same thing for you second instance in DC2

As you AG will be in Async, the best approch to plan a failover from your DC1 to your DC2 without any data loss would be to
1- Have a maintenance windows where there will not be too much activity
2- Switch your AG to Sync mode and let it sync
3- Once sync (you can check dmv to make sure they are sync), trigger a failover within SQL (Using the Gui in SSMS or TSQL)
4- Once the failover is completed, set the AG back to Async

Note that in Async, you cannot enable Automatic failover so if ever your FCI from DC1 fail (problem with the shared storage for example) while being primary, it won't failover to your DC2 and manually failover to it may cause data loss.