Sql-server – SQL Server 2014: Graceful shutdown of an Availability Group node

availability-groups, sql-server

What I'm trying to do is gracefully shut down one of the two nodes in our SQL Server Availability Group, effectively draining all the connections.

Previously, we were just stopping the SQL Server instance on the node, but that results in a flurry of "The specified network name is no longer available" errors from client applications.

What is the kosher way of shutting down a node?

I'm no DBA here, so please, ask whatever questions you deem necessary.

Best Answer

You should fail over to one of the other nodes first, before shutting down SQL Server.

This can be done either in SQL Server Management Studio (SSMS), under the Always On High Availability node in the Object Explorer tree, or alternatively with the Windows Server Failover Cluster (WSFC) Manager program, which is available in the Start Menu of all the Windows servers hosting the SQL Server instances.

Using SSMS is considered the better approach, as the WSFC utility won't know the synchronisation state of the other replicas.
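Whichever route you take, it can help to check the synchronization state on the intended target yourself first. The following is a minimal sketch, not a definitive procedure. It assumes the SqlServer PowerShell module (or the older SQLPS) is installed, and `SQLNODE2` is a hypothetical instance name for the replica you plan to fail over to. The query reads the `sys.dm_hadr_database_replica_states` dynamic management view.

```powershell
Import-Module SqlServer   # assumption: SqlServer module is installed; SQLPS also provides Invoke-Sqlcmd

# Hypothetical instance name of the replica you intend to fail over to.
$target = "SQLNODE2"

# List each availability database hosted on that replica with its synchronization state.
$query = @"
SELECT DB_NAME(drs.database_id) AS database_name,
       drs.synchronization_state_desc
FROM sys.dm_hadr_database_replica_states AS drs
WHERE drs.is_local = 1;
"@

Invoke-Sqlcmd -ServerInstance $target -Query $query | Format-Table -AutoSize
```

For a planned failover without data loss, the databases on the target should report SYNCHRONIZED, which requires the replicas to be in synchronous-commit mode.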

To do a failover via T-SQL instead, connect to the node you want to fail over to (not from), and execute this once for each availability group you have:

T-SQL : ALTER AVAILABILITY GROUP My_Group_Name FAILOVER;

In PowerShell you have this option: Switch-SqlAvailabilityGroup

(Source for code samples)
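As a rough sketch of the PowerShell option: the names `SQLNODE2`, `DEFAULT` and `My_Group_Name` below are placeholders for the target server, its instance name and the availability group, and the SqlServer module is assumed to be installed. As with the T-SQL command, the path points at the replica you want to fail over to, not the current primary.

```powershell
Import-Module SqlServer   # assumption: SqlServer module is installed; SQLPS also provides the cmdlet

# Provider path to the availability group on the replica that should become primary.
# Server, instance and group names are placeholders.
$agPath = "SQLSERVER:\Sql\SQLNODE2\DEFAULT\AvailabilityGroups\My_Group_Name"

# Planned manual failover to that replica.
Switch-SqlAvailabilityGroup -Path $agPath
```

Repeat the call with a different path for each availability group hosted on the node you are about to shut down.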

Related Solutions

SQL Server 2008 – Proper Shutdown of SQL Servers in a Cluster

Does anyone see any issues with that?

Your steps are correct, but the approach is far more work than the task needs.

For disk maintenance, why would you shut down the entire cluster?

Just suspend that disk (the one that requires maintenance), and then, once the work is done, resume the node.
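The advice above can be read two ways, so this sketch shows both using cmdlets from the FailoverClusters module. The node name `NODE1` and the resource name `Cluster Disk 1` are hypothetical. Use one reading or the other, not both.

```powershell
Import-Module FailoverClusters

# Placeholders: substitute your own node and disk resource names.
$node = "NODE1"
$disk = "Cluster Disk 1"

# Reading 1: put only the disk resource into maintenance mode, then bring it back.
Suspend-ClusterResource -Name $disk
# ... perform the disk maintenance ...
Resume-ClusterResource -Name $disk

# Reading 2: pause the whole node, then resume it.
Suspend-ClusterNode -Name $node
# ... perform the maintenance ...
Resume-ClusterNode -Name $node
```

On Windows Server 2012 and later, `Suspend-ClusterNode` also accepts `-Drain` to move roles off the node while pausing it.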

Basically, SHUT DOWN CLUSTER will stop all the roles and services on all the nodes of the cluster. The Cluster UI will ensure that all the roles and services are shut down gracefully.

It seems you have posted the same question at SQLServerCentral.com.

Refer to How to Properly Shutdown a Failover Cluster or a Node

Sql-server – SQL Server 2012 failover cluster node won’t start cluster services

I have had similar, though not identical, issues to those described here, here and here.

What can I do to start cluster service on node1 back on?

When a node finds itself alone in a Windows Server failover cluster, it loses quorum.

Then, to start the Windows failover cluster service without quorum, you need to force it to start.

This is very important, because generally you would have something like this:

Windows cluster name: the name that all the applications try to connect to. Let's say it is called the_server. In fact the_server does not exist; what exists is node1 and node2 (plus maybe a quorum disk or shared storage for quorum purposes). So even if you have only one node left, you need the Windows failover cluster service running so that all applications can find the_server.

The way I would do that is using PowerShell:

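Import-Module FailoverClusters

# Name of the node on which to force the cluster service to start
$node = "AlwaysOnSrv02"
# Make sure the cluster service is stopped, then force it to start without quorum
Stop-ClusterNode -Name $node
Start-ClusterNode -Name $node -FixQuorum

# Make the node a voting member of the quorum
(Get-ClusterNode $node).NodeWeight = 1

# Show the resulting node states and vote weights
$nodes = Get-ClusterNode -Cluster $node
$nodes | Format-Table -property NodeName, State, NodeWeight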

This is described in detail here:

Force a Windows Server failover cluster to start without a quorum

To force a cluster to start without a quorum:

1. Start an elevated Windows PowerShell session via Run as Administrator.
2. Import the FailoverClusters module to enable the cluster cmdlets.
3. Use Stop-ClusterNode to make sure that the cluster service is stopped.
4. Use Start-ClusterNode with -FixQuorum to force the cluster service to start.
5. Use Get-ClusterNode and set the NodeWeight property to 1, the value that guarantees that the node is a voting member of the quorum.
6. Output the cluster node properties in a readable format.

First, open the Failover Cluster Manager application and have a look at what we have got:

This is what it looks like:

Check the servers inside the nodes:

This is the normal way to manage the clustering services.

However, when there is no quorum we need to force the service to start. For that you need to follow this link.

Just don't forget how to run the PowerShell: run the script command by command.

First this:

Import-Module FailoverClusters

Then

$node = "SQLPROD2"
#Stop-ClusterNode -Name $node
Start-ClusterNode -Name $node -FixQuorum

(Get-ClusterNode $node).NodeWeight = 1

$nodes = Get-ClusterNode -Cluster $node
$nodes | Format-Table -property NodeName, State, NodeWeight

And all is good with only one node up and running, at least until we add a new node. There is no downtime, and no application is upset at being unable to connect to the_server.
