Sql-server – Quorum question for SQL Cluster (Always On)

clustering, high-availability, sql server, sql-server-2016

We are looking at setting up an SQL Always On High Availability environment at our organisation.
Our organisation has two datacenters, one of which is much smaller; the smaller one is being set up to provide some high availability.

At our primary site we are looking at installing three servers, and one at the secondary site. These servers will then be added to a cluster (and we'll add the DBs to some Availability Groups across these servers).

But here comes the problem. When the primary site is down, due to a power outage for example, the whole cluster will be down because the one remaining server doesn't have a majority, even with a default quorum. We also looked at placing the quorum on cloud storage (disk-only quorum setup), but when the internet connection is down (which happens sometimes here in Belgium) we would also lose the entire cluster.

Does anyone have any experience or a solution for this?

Thanks in advance!

Best Answer

At our primary site we are looking at installing three servers, and one at the secondary site. These servers will then be added to a cluster (and we'll add the DBs to some Availability Groups across these servers).

With the current configuration, there is no way to keep the WSFC running if the primary site, which holds 3 of the 4 WSFC nodes, goes down due to a power outage or some other disaster. Adding a file share witness would not help with this configuration: it would give you 5 voting members, and the WSFC needs at least 3 of them online to stay up. When the primary DC goes down, 3 voting members go down with it, leaving only 2, so the WSFC goes down as well.
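The vote arithmetic above can be checked with a small sketch. It assumes a plain static node-majority model (more than half of all configured votes must be online) and deliberately ignores dynamic quorum and manual vote adjustments; the function name `has_quorum` is made up for illustration and is not part of any Windows or SQL Server API.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """Strict majority: more than half of all configured votes must be online."""
    return votes_online > total_votes // 2


# Layout from the question: 3 nodes at the primary site, 1 at the secondary.
primary_nodes, secondary_nodes = 3, 1

# witness = 0: no witness; witness = 1: a file share witness that is
# assumed to live outside the primary site and to stay reachable.
for witness in (0, 1):
    total = primary_nodes + secondary_nodes + witness
    surviving = secondary_nodes + witness  # the whole primary site is lost
    print(
        f"witness={witness}: {surviving}/{total} votes online, "
        f"quorum={has_quorum(surviving, total)}"
    )
```

In both cases the surviving votes (1 of 4, or 2 of 5) fall short of a majority, which matches the reasoning above.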

Does anyone have any experience or a solution for this?

  1. The solution would be to change the architecture, if you do not want the WSFC to come down when the primary DC goes down. You can move the 3rd node from the primary to the secondary DC and configure a file share (FS) witness in the cloud or at some third location that is neither the primary nor the secondary DC but is accessible to both. Then, even if the primary DC is down, 3 of the 5 voting members remain (2 nodes in the secondary DC plus the FS witness), which keeps the WSFC up and running.

  2. The other solution, if you can afford WSFC downtime, is to leave the architecture as it is and, when disaster strikes the primary DC, use forced quorum (ForceQuorum) to bring the WSFC online on the DR node. This seems like a bad choice at first, but believe me, with a little practice it works like a charm: within 10-15 minutes or even less the WSFC would be online on the DR node, and with a forced failover with data loss you could bring the databases online as well. All you need is a few commands to bring the WSFC and the databases online.
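To compare the existing layout with the rearranged one from option 1, here is a rough sketch that removes one site at a time and checks whether the remaining votes still form a majority. It assumes a static majority rule with one vote per node and per witness; the site labels and the helper `survives_outage` are hypothetical names chosen for this example. Forced quorum (option 2) is a manual recovery step and is not modelled.

```python
from typing import Dict


def survives_outage(votes_by_site: Dict[str, int], failed_site: str) -> bool:
    """Return True if the cluster keeps a strict majority after one site fails."""
    total = sum(votes_by_site.values())
    online = total - votes_by_site[failed_site]
    return online > total // 2


# Layout from the question: 3 nodes at primary, 1 at secondary, no witness.
current = {"primary": 3, "secondary": 1}

# Option 1: 2 nodes per site plus a file share witness at a third location.
proposed = {"primary": 2, "secondary": 2, "witness_site": 1}

for name, layout in (("current", current), ("proposed", proposed)):
    for site in layout:
        print(f"{name}: lose {site} -> quorum kept: {survives_outage(layout, site)}")
```

Under these assumptions the proposed layout keeps quorum when any single location is lost, whereas the current layout only survives the loss of the secondary site.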
