Sql-server – Quorum Configuration for a 3-Node Windows Cluster

clustering, disaster recovery, sql server, sql-server-2012, windows-server

We have 2 nodes onsite and a 3rd node at the DR site, and there is no disk witness. For maintenance, I made the DR node the primary and set the 2 onsite nodes to asynchronous mode with manual failover. Afterwards both onsite nodes went down together, and the whole cluster went down because quorum could not achieve a majority. I believe dynamic quorum failed because both onsite nodes went down simultaneously.

Since this is a 3-node cluster, node majority quorum is used. My question is: if I face a similar scenario where both onsite nodes go down together, how can I keep the cluster operational on the one node at the DR site?

Best Answer

If I face a similar scenario where both onsite nodes go down together, how can I keep the cluster operational on the one node at the DR site?

With the current configuration you cannot, assuming both nodes go down simultaneously. To avoid this, add one more node at the DR site to the WSFC configuration and set up a witness, preferably a file share witness. The file share witness should reside on a VM or machine that is accessible from both production (DC) and DR, and it should be kept at a "third place" that is not affected when either DC or DR is completely down. I would suggest a cloud witness here. A cloud witness can have some lag, so please test the scenario with various test cases. With 4 nodes and 1 file share witness there are 5 votes, so even if the 2 nodes at DC go down, 3 votes remain (2 nodes at DR plus the file share witness) to keep the WSFC up and running.

The quorum mode for this configuration would be Node and File Share Majority.
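The vote arithmetic behind this recommendation can be checked with a short Python sketch. It only illustrates the strict-majority rule; the function name `has_quorum` is made up for this example, and it does not model dynamic quorum or call any real cluster API.

```python
def has_quorum(total_votes: int, surviving_votes: int) -> bool:
    """Return True if the surviving votes are a strict majority (> 50%)."""
    return surviving_votes * 2 > total_votes


# Current setup: 2 nodes at DC + 1 node at DR, no witness (3 votes).
# Both DC nodes fail at once, leaving 1 vote.
current = has_quorum(total_votes=3, surviving_votes=1)

# Proposed setup: 2 nodes at DC + 2 nodes at DR + 1 file share witness
# at a third site (5 votes). Both DC nodes fail, leaving 3 votes.
proposed = has_quorum(total_votes=5, surviving_votes=3)

print(f"current 3-node cluster survives DC loss: {current}")
print(f"4 nodes + witness survives DC loss: {proposed}")
```

Under these assumptions the first check fails (1 of 3 votes) and the second passes (3 of 5 votes).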

Replying to your questions from the comments:

So can I have 2 nodes onsite, a 3rd node at DR, and a file share at the DR site? In that case, can I survive if both onsite nodes go down?

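No, the WSFC would not survive. There would be 4 voting members in the WSFC, and more than 2 of them (more than 50%) must be voting for quorum to hold and the WSFC to stay online. With both onsite nodes down, only 2 votes (the DR node and the file share) would remain.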
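To make the effect of witness placement concrete, here is a small Python sketch that counts votes per site and checks what is left after a whole site is lost. The helper `survives_site_loss` and the site labels are hypothetical, and the model uses static votes only (no dynamic quorum).

```python
def survives_site_loss(votes_by_site: dict, lost_site: str) -> bool:
    """Return True if the votes outside lost_site are a strict majority."""
    total = sum(votes_by_site.values())
    remaining = total - votes_by_site.get(lost_site, 0)
    return remaining * 2 > total


# 2 nodes onsite (DC); 1 node plus the file share witness at DR: 4 votes.
witness_at_dr = {"DC": 2, "DR": 2}

for site in ("DC", "DR"):
    ok = survives_site_loss(witness_at_dr, site)
    print(f"lose {site}: cluster stays online = {ok}")
```

With the witness at the DR site, losing either site leaves exactly 2 of 4 votes, which is not more than 50%, so neither outage is survivable in this static model.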

Or, during the failover before the maintenance, can I just remove the voting rights of the 2 onsite nodes, leaving the DR node with 1 vote, so that the cluster stays alive even if the 2 onsite nodes go down?

Now this is different from an unexpected failover. Yes, you can remove the votes from the 2 nodes at DC. Or, if you are using Windows Server 2012 R2 or above, you can gracefully shut down the 2 nodes at DC and the DR node would still be online; this is called "last man standing". It works because of the dynamic quorum and dynamic witness functionality of Windows Server 2012 R2. But please keep in mind that in this scenario you are doing the shutdown in a planned manner, and that is why dynamic quorum and dynamic witness work. If the 2 nodes go down unexpectedly, nothing would work and the whole WSFC would come down.
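The difference between a planned shutdown, manual vote removal, and an unexpected double failure can be sketched with a toy model in Python. This is a deliberate simplification and not the actual WSFC algorithm: the class `ToyCluster` and the node names `DC1`, `DC2`, and `DR1` are invented for illustration, and the model assumes votes are recalculated only while the cluster is still healthy.

```python
class ToyCluster:
    """Very simplified model of dynamic quorum; not the real WSFC logic."""

    def __init__(self, voters):
        self.voters = set(voters)  # nodes that currently hold a vote
        self.online = True

    def graceful_shutdown(self, node):
        # Planned shutdown: the cluster is healthy, so the vote count is
        # recalculated and the departing node no longer counts.
        if self.online:
            self.voters.discard(node)
            self.online = len(self.voters) > 0

    def simultaneous_failure(self, failed):
        # Unplanned loss: survivors must be a strict majority of the votes
        # that existed at the moment of failure.
        if not self.online:
            return
        survivors = self.voters - set(failed)
        if len(survivors) * 2 > len(self.voters):
            self.voters = survivors
        else:
            self.online = False


# Planned: DC nodes are shut down one after another ("last man standing").
planned = ToyCluster(["DC1", "DC2", "DR1"])
planned.graceful_shutdown("DC1")
planned.graceful_shutdown("DC2")

# Unplanned: both DC nodes vanish at once while all 3 votes are active.
unplanned = ToyCluster(["DC1", "DC2", "DR1"])
unplanned.simultaneous_failure(["DC1", "DC2"])

# Votes removed from the DC nodes before maintenance: only DR1 votes.
manual = ToyCluster(["DR1"])
manual.simultaneous_failure(["DC1", "DC2"])

print(planned.online, unplanned.online, manual.online)
```

In this model the planned and manual-vote cases stay online on the DR node, while the simultaneous unplanned failure takes the cluster down, which matches the behaviour described above.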

For your current configuration, if the WSFC goes down because of an unexpected DC shutdown, you can use forcequorum to bring the WSFC online on the DR node. This method is not complex, and it is useful if some downtime can be afforded. It is a simple procedure, and it would hardly take 5-10 minutes to bring the WSFC online. This saves the cost of additional nodes and additional infrastructure.