SQL Server – Always On AG goes down when the service is stopped

availability-groups, failover, sql-server, sql-server-2014, windows-server

I have SQL Server 2014 running on Windows Server 2012 R2.

There are three nodes; one is in a remote data center and has no vote, so only two nodes have a vote.
The failover cluster uses a file share witness on a remote server.

If the primary server goes down, automatic failover of the AG works. But when only the SQL Server service goes down, automatic failover does not happen and the AG stays in the Resolving state.

Also:

  • Stopped the service from SQL Server Configuration Manager: AG failover succeeds.
  • Stopped the service from services.msc: AG failover fails.
  • Killed the service process from the process list: AG failover fails.

Voting:

Server         Assigned Vote       Current Vote
Node 1              1                   1
Node 2              1                   1
Node 3              0                   0
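
As a rough check on the vote table above, here is a minimal sketch of the majority rule in Python. It assumes the file share witness contributes one vote and ignores any dynamic quorum adjustment the cluster may apply; the names `has_quorum`, `VOTES` and `"Witness"` are made up for illustration and are not part of any cluster API.

```python
# Simplified majority-quorum check (illustrative model only, not WSFC code).
# Assumption: the file share witness counts as one vote.
VOTES = {"Node 1": 1, "Node 2": 1, "Node 3": 0, "Witness": 1}


def has_quorum(votes, online):
    """Return True if the online members hold a strict majority of all votes."""
    total = sum(votes.values())
    present = sum(v for name, v in votes.items() if name in online)
    return present > total // 2


# Node 1 is lost, Node 2 and the witness remain: 2 of 3 votes.
print(has_quorum(VOTES, {"Node 2", "Node 3", "Witness"}))
# Only Node 2 remains reachable: 1 of 3 votes.
print(has_quorum(VOTES, {"Node 2"}))
```

Under these assumptions, losing Node 1 alone leaves the cluster with quorum, which suggests the stuck Resolving state is not a quorum problem.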

I disabled the firewall on all networks; nothing changed and the problem continues.

Can someone help me solve this problem?

Critical Error

The role of this availability replica is unhealthy.
The replica does not have either the primary or secondary role.

Failover is from node1 to node2. Both are set to synchronous commit and automatic failover. The nodes are not in a paused state. I installed SQL Server on all nodes as standalone instances.

Best Answer

It could be the issue described in "INF: AlwaysOn – The secondary database doesn’t come automatically when the primary instance of SQL Server goes down" by Arvindh Kalidasan, Support Engineer, Microsoft GTSC.

In this blog we would discuss about behavior of AlwaysOn availability group where the secondary database doesn't come automatically when the primary instance goes down. The secondary database goes into Resolving state. On the failover cluster manager the resource appears in fail state.

[...] we found that if we stop SQL Service manually on primary replica, it fails over to second node only once. Any further attempts of stopping SQL Service (to test auto failover) would not cause failover.

The workaround posted there is:

[...] to have the value set to a higher number for "Maximum Failures in the specified period".

  1. Maximum Failures in the specified period: set to 60
  2. Period (Hours): set to 1
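
To see why a low threshold would allow only one failover, here is a simplified Python model of a "maximum failures in the specified period" rule. This is only a sketch of the idea, not the cluster service's actual implementation; the class `FailoverPolicy`, its method `on_failure`, and the strict values of 1 failure per 6 hours are assumptions chosen for illustration.

```python
from collections import deque


class FailoverPolicy:
    """Toy model: allow failover while failures in the window stay within the limit."""

    def __init__(self, max_failures, period_hours):
        self.max_failures = max_failures
        self.period_seconds = period_hours * 3600
        self._failures = deque()  # timestamps (seconds) of recent failures

    def on_failure(self, now_seconds):
        """Record a failure; return True if the role would fail over."""
        cutoff = now_seconds - self.period_seconds
        # Forget failures that fell out of the sliding window.
        while self._failures and self._failures[0] <= cutoff:
            self._failures.popleft()
        self._failures.append(now_seconds)
        return len(self._failures) <= self.max_failures


# Hypothetical strict setting: 1 failure allowed per 6 hours.
strict = FailoverPolicy(max_failures=1, period_hours=6)
print(strict.on_failure(0))    # first stop of the service
print(strict.on_failure(600))  # second stop ten minutes later

# Workaround values from the answer: 60 failures per 1 hour.
relaxed = FailoverPolicy(max_failures=60, period_hours=1)
print(all(relaxed.on_failure(t * 60) for t in range(10)))
```

In this model the second test within the window exceeds the strict limit, so the role stays failed, while the relaxed setting keeps allowing failover during repeated testing.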