AlwaysOn availability group DR failover with PowerShell

availability-groups, disaster recovery, powershell

I am setting up an AlwaysOn availability group for disaster recovery (DR) in a test environment. For AG1, node 1 and node 2 (synchronous commit, automatic failover) are in DC1; node 3 and node 4 (asynchronous commit, manual failover) are in DC2; and the file share witness is in DC3. When a DR event occurs, the two nodes in DC1 will be gone. Because the DR nodes are set up as asynchronous commit, the AG status is Resolving. I will need to fail over manually to bring the AG back to a synchronized state. I want to use PowerShell to fail over with data loss. When I run the following PowerShell command to fail over, I get the error shown after it.

Switch-SqlAvailabilityGroup -Path SQLSERVER:\Sql\SQLDRPOCA03\DEFAULT\AvailabilityGroups\SQLDRPOCAG

Switch-SqlAvailabilityGroup : The availability replica for availability group 'SQLDRPOCAG' on this instance of SQL Server cannot become the primary replica. One or more databases are not synchronized or have not joined the availability group.

If the availability replica uses the asynchronous-commit mode, consider performing a forced manual failover (with possible data loss). Otherwise, once all local secondary databases are joined and synchronized, you can perform a planned manual failover to this secondary replica (without data loss). For more information, see SQL Server Books Online.

Is it possible to use PowerShell to do this manual failover?

Best Answer

I think the PowerShell command you are running attempts a planned manual failover, which the MSDN documentation describes as follows:

A planned manual failover is supported only when the primary replica and the target secondary replica are running in synchronous-commit mode and are currently synchronized. A planned manual failover preserves all the data in the secondary databases that are joined to the availability group on the target secondary replica

What you are actually trying to achieve, a failover between asynchronous replicas, is a forced manual failover (with possible data loss), which can be done via T-SQL as described in the same MSDN article. I don't see PowerShell there, but maybe someone here who has used it can answer. I am not sure, but you could also check whether dbatools provides one.

Go through the steps described later in that article:
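For what it is worth, the Switch-SqlAvailabilityGroup cmdlet in the SqlServer module has an -AllowDataLoss switch, which corresponds to a forced failover, and a -Force switch that suppresses the confirmation prompt. Below is a minimal sketch, assuming the SqlServer module is installed and the command is run against the DR replica that should become primary. The wrapper function name Invoke-ForcedAgFailover is hypothetical and only there for illustration.

```powershell
# Hypothetical wrapper around a forced failover (possible data loss).
# Assumes the SqlServer module is installed and $AgPath points at the
# availability group on the secondary replica that should become primary.
function Invoke-ForcedAgFailover {
    param(
        [Parameter(Mandatory)] [string] $AgPath
    )
    Import-Module SqlServer -ErrorAction Stop

    # -AllowDataLoss requests a forced failover; -Force skips the confirmation prompt.
    Switch-SqlAvailabilityGroup -Path $AgPath -AllowDataLoss -Force
}

# Example call, using the path from the question:
# Invoke-ForcedAgFailover -AgPath 'SQLSERVER:\Sql\SQLDRPOCA03\DEFAULT\AvailabilityGroups\SQLDRPOCAG'
```

Keep in mind that after a forced failover the secondary databases on the remaining replicas are suspended and have to be resumed manually once you have decided how to handle any lost data.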

Manual failover without data loss

Use this method when the primary replica is available, but you need to temporarily or permanently change the configuration and change the SQL Server instance that hosts the primary replica. To avoid potential data loss, before you issue the manual failover, ensure that the target secondary replica is up to date.
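One way to check that the target secondary is up to date before a planned failover is to look at the synchronization state of its local databases in sys.dm_hadr_database_replica_states. The sketch below assumes Invoke-Sqlcmd from the SqlServer module is available; the helper name Test-AllDatabasesSynchronized and the server name in the usage comment are made up for illustration.

```powershell
# Hypothetical helper: returns $true only when at least one row is supplied
# and every row reports synchronization_state_desc = 'SYNCHRONIZED'.
function Test-AllDatabasesSynchronized {
    param([object[]] $Rows)

    if ($null -eq $Rows -or $Rows.Count -eq 0) { return $false }

    $notReady = @($Rows | Where-Object { $_.synchronization_state_desc -ne 'SYNCHRONIZED' })
    return ($notReady.Count -eq 0)
}

# Usage sketch (run against the target secondary; the server name is a placeholder):
# $query = 'SELECT DB_NAME(database_id) AS database_name, synchronization_state_desc ' +
#          'FROM sys.dm_hadr_database_replica_states WHERE is_local = 1;'
# $rows  = @(Invoke-Sqlcmd -ServerInstance 'YourTargetReplica' -Query $query)
# if (Test-AllDatabasesSynchronized -Rows $rows) { 'Safe to run a planned failover' }
```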

Related Solutions

Sql-server – AlwaysOn Availability Group Automatic Failover does not work

If I disconnect DEV-AWEB5

Define "disconnect", if you will. My guess is you kept the box up but took SQL Server down.

I cannot connect to the Group Listener (DevListener), but I can ping it and it will respond to my ping

That's because the listener is just a virtual network name (VNN) within the WSFC cluster resource group for the represented availability group. Your DEV-AWEB5 node still owns the cluster resource group; most likely it's just the AG cluster resource that is in a failed state. The VNN must still be online (expected behavior). It's simply pointing to whatever node owns that resource group (in this case, DEV-AWEB5). In fact, if you had PowerShell remoting enabled, you could run the following:

Invoke-Command -ComputerName "YourListenerName" -ScriptBlock { $env:computername }

The return of that would be the computer name of your owning node.

Likewise, if you can RDP into DEV-AWEB5 (provided you have the capability and accessibility, etc.) then you'd be able to RDP using the listener name (mstsc /v:YourListenerName). It's just a VNN.

By all of your symptoms, I'd be willing to bet that you've reached your failover threshold. The failover threshold determines how many times the cluster will attempt to fail over your resource group in a specified time period. The default for these values is a maximum of n - 1 failovers (where n is the count of nodes) in a period of 6 hours. You can see that through the following WSFC PowerShell command:

Get-ClusterGroup -Name "YourAgName" |
    Select-Object Name, FailoverThreshold, FailoverPeriod

That just gives you the settings (which you can modify if you so choose, of course).
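If you do want to change them, the cluster group object exposes FailoverThreshold and FailoverPeriod as settable properties. Here is a minimal sketch, assuming the FailoverClusters module on a cluster node; the function name Set-AgFailoverThreshold is hypothetical, and the values you pass are up to you.

```powershell
# Hypothetical helper; assumes the FailoverClusters module is available
# and the session runs on (or can reach) a node of the cluster.
function Set-AgFailoverThreshold {
    param(
        [Parameter(Mandatory)] [string] $GroupName,
        [Parameter(Mandatory)] [int] $Threshold,
        [int] $PeriodHours = 6
    )
    Import-Module FailoverClusters -ErrorAction Stop

    $group = Get-ClusterGroup -Name $GroupName
    $group.FailoverThreshold = $Threshold   # max failovers allowed within the period
    $group.FailoverPeriod    = $PeriodHours # period length in hours

    # Show the resulting settings.
    $group | Select-Object Name, FailoverThreshold, FailoverPeriod
}

# Example call with illustrative values:
# Set-AgFailoverThreshold -GroupName 'YourAgName' -Threshold 10 -PeriodHours 1
```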

The best way to prove that this is the case for you is to generate the cluster log (the system event logs only go into detail as far as " has failed", or something like that).

Get-ClusterLog -Node "YourClusterNode" -TimeSpan <amount_of_minutes_since_failure>

By default, that gets written to the "C:\Windows\Cluster\Reports" folder, and the file is called "Cluster.log".

If you were to open up that cluster log, you should be able to find the following string in there, indicating exactly what happened and why it happened:

Not failing over group [YourClusterGroupName], failoverCount [# of failovers], failover threshold [failover threshold value], nodeAvailCount [node available count].

The above message is simply WSFC telling you that it will not fail over your group because it has happened too many times (you hit the threshold).
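Rather than scrolling through the log by hand, you can search it for that string. This is a minimal sketch that assumes the default log location mentioned above; the function name Find-FailoverThresholdHit is made up for illustration.

```powershell
# Hypothetical helper: lists the cluster log lines that report a skipped failover.
# The default path assumes Get-ClusterLog wrote to its default destination.
function Find-FailoverThresholdHit {
    param(
        [string] $LogPath = 'C:\Windows\Cluster\Reports\Cluster.log'
    )
    # -SimpleMatch treats the pattern as a literal string, not a regex.
    Select-String -Path $LogPath -Pattern 'Not failing over group' -SimpleMatch |
        Select-Object LineNumber, Line
}

# Example call:
# Find-FailoverThresholdHit | Format-List
```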

Why does this happen? Simply to prevent the Ping-Pong effect of cluster resources going back and forth too frequently between nodes.

While it is common to hit these thresholds in failover testing, in production it would typically point to a problem that should be investigated.

Is it possible to use AAG with two 2-Node Clusters across two datacenters

Simply put, if you have, or can create, a Windows Server Failover Cluster set up across the data centres as a stretch cluster, then there is no reason why this won't be possible. The fact that the DCs are geographically dispersed is irrelevant. Good networking can bring them together as if they were not.

You could also do this as a SQL Cluster, with replication across the disks, and the disks presented to the cluster as a shared cluster resource.

I think you need to read up on the pros and cons of each solution, and maybe speak to a Windows Server and/or networking professional.
