SQL Server – Quorum disk conundrum

failover, sql server, sql-server-2008-r2, windows-server

I'm having a problem with a 2-node active-active cluster which has been failing over lately but not bringing resources back online.

We've had a Microsoft engineer look at it and, among other things, he noted that it was unusual to have a Quorum disk group explicitly created under the Services and Applications node and thought that this might cause a problem.

I'm a bit puzzled as to whether this is a factor or not. Certainly most of the documentation on quorum disk configurations doesn't create a specific cluster group to contain the Quorum disk; just going through the Quorum Disk Configuration Wizard seems to be all that is required.

I had always thought of a Quorum disk as being a 'fixed point', i.e. it doesn't belong to a cluster group and doesn't fail over when a node fails over, as to do so might mean the cluster is no longer in quorum (quorate?). However, other documents I've read suggest that the Quorum disk does 'belong' to a node. If so, in an active-active configuration, which node gets to own it?

Best Answer

You should not have the quorum disk as a resource inside a "Service or Application". It needs to be configured separately, and should be visible only under the "Disk Witness in Quorum" section in the "Storage" configuration pane.

When the cluster is configured with either "No Majority - Disk Only" or "Node and Disk Majority", the cluster service uses the quorum disk as a "vote", since the disk can only be owned by a single node at a time. If the nodes cannot communicate with each other, only a single node will be able to connect to the disk and obtain the required number of votes to achieve quorum. This prevents a partition where both nodes think they can run the services and applications configured in the cluster. If no nodes can connect to the quorum disk, none of the nodes will run the cluster services and applications.

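Assigning the quorum disk to a Service or Application means the disk will go offline when a failover occurs. If the disk goes offline, it cannot be used as a vote: under "No Majority - Disk Only" no node can achieve quorum, and under "Node and Disk Majority" a lone surviving node in a two-node cluster holds only one of three votes. Either way, the entire cluster will not operate.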
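To make the vote counting concrete, here is a minimal Python sketch of the arithmetic for the two quorum modes mentioned above. It is a simplified model for illustration only, not how the cluster service is implemented; the names `QuorumMode` and `has_quorum` are hypothetical, and it assumes each node and the disk witness carry exactly one vote.

```python
from enum import Enum


class QuorumMode(Enum):
    # Hypothetical names for the two modes discussed in the answer.
    NODE_AND_DISK_MAJORITY = "Node and Disk Majority"
    DISK_ONLY = "No Majority - Disk Only"


def has_quorum(mode, total_nodes, reachable_nodes, owns_disk):
    """Return True if a partition of the cluster can achieve quorum.

    total_nodes     -- number of nodes configured in the cluster
    reachable_nodes -- nodes in this partition (including the local node)
    owns_disk       -- True only if this partition holds the quorum disk
                       and the disk is online
    """
    if mode is QuorumMode.DISK_ONLY:
        # Only the disk counts; node votes are ignored.
        return owns_disk

    # Node and Disk Majority: one vote per node plus one for the disk.
    total_votes = total_nodes + 1
    votes = reachable_nodes + (1 if owns_disk else 0)
    # Quorum requires a strict majority of all configured votes.
    return votes > total_votes // 2


mode = QuorumMode.NODE_AND_DISK_MAJORITY

# Two nodes lose contact with each other: only the disk owner keeps quorum.
node_a_runs = has_quorum(mode, total_nodes=2, reachable_nodes=1, owns_disk=True)
node_b_runs = has_quorum(mode, total_nodes=2, reachable_nodes=1, owns_disk=False)

# One node fails and the quorum disk goes offline with it: the survivor
# has one vote out of three and cannot achieve quorum.
survivor_runs = has_quorum(mode, total_nodes=2, reachable_nodes=1, owns_disk=False)
```

In this model a two-node cluster has three votes, so a node needs the disk's vote as well as its own to keep running on its own. That is why a quorum disk that goes offline during a failover leaves the surviving node unable to bring resources online.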

See the Windows Server 2008 R2 Cluster Docs and "The process of achieving quorum" for details.
