I have had issues similar, though not identical, to those described here, here and here.
What can I do to start the cluster service on node1 again?
When a node finds itself alone in a Windows Server failover cluster, it loses quorum.
To start the cluster service without quorum, you then need to force it to start.
This is very important, because generally you would have something like this:
Windows cluster name - the name that all the applications try to connect to: let's say it is called the_server
But the_server does not actually exist; what exists is node1 and node2 (plus maybe a quorum disk or shared storage for quorum purposes). So even if you have only one node left, you need the cluster service running on it, so that all applications can find the_server.
The way I would do that is using PowerShell:
Import-Module FailoverClusters
$node = "AlwaysOnSrv02"
Stop-ClusterNode -Name $node
Start-ClusterNode -Name $node -FixQuorum
(Get-ClusterNode $node).NodeWeight = 1
$nodes = Get-ClusterNode -Cluster $node
$nodes | Format-Table -property NodeName, State, NodeWeight
This is described in detail here:
Force a windows server failover cluster to start without a quorum
To force a cluster to start without a quorum:
- Start an elevated Windows PowerShell via Run as Administrator.
- Import the FailoverClusters module to enable cluster commandlets.
- Use Stop-ClusterNode to make sure that the cluster service is stopped.
- Use Start-ClusterNode with -FixQuorum to force the cluster service to start.
- Use Get-ClusterNode with -Property NodeWeight = 1 to set the value that guarantees that the node is a voting member of the quorum.
- Output the cluster node properties in a readable format.
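Once the service has been forced to start, it may be worth confirming what the cluster now considers its quorum configuration and which nodes hold a vote. A minimal sketch, assuming it is run in an elevated PowerShell session on the surviving node itself (so no -Cluster argument is given), and not part of the documented procedure above:

```powershell
# Load the cluster cmdlets (same module as in the script above).
Import-Module FailoverClusters

# Show the quorum configuration the local cluster is currently using.
Get-ClusterQuorum | Format-List *

# List every node with its state and its vote (NodeWeight 1 = voting member).
Get-ClusterNode | Format-Table -Property Name, State, NodeWeight
```

Nodes that are unreachable should show up with a state other than Up, which makes it easy to see that the cluster is running on a single vote.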
First, open the Failover Cluster Manager application and have a look at what we have got:
This is what it looks like:
Check the servers inside the nodes:
This is the normal way to manage the clustering services.
However, when there is no quorum, we need to force the service to start.
For that you need to follow this link.
Just don't forget how to run the PowerShell:
The way to run this script is command by command.
First this:
Import-Module FailoverClusters
Then
$node = "SQLPROD2"
#Stop-ClusterNode -Name $node
Start-ClusterNode -Name $node -FixQuorum
(Get-ClusterNode $node).NodeWeight = 1
$nodes = Get-ClusterNode -Cluster $node
$nodes | Format-Table -property NodeName, State, NodeWeight
And all good, with only one node up and running, at least until we add a new node (with no downtime, and no application upset about not connecting to the_server):
One of the requirements for AGs is that all instances are participating nodes in the same WSFC. Based on what you are telling us, I believe that the 3 instances are, in fact, in 3 separate clusters.
I don't believe that what you are trying to accomplish is possible.
Best Answer
To solve the issue, do an edition upgrade of SQL Server on node 2.
https://docs.microsoft.com/en-us/sql/database-engine/install-windows/upgrade-to-a-different-edition-of-sql-server-setup?view=sql-server-2017
It should not even require the service to be stopped. At the end you have to restart node 2.