SQL Server Always On – Listener Behavior and DB Removal

availability-groupssql server

We have multiple multi-site always on failover cluster instances in our env. We have an enforced internet outage being imposed on us and at the time the link between the data centers will be down for up to 6 hours…

Our plan is to remove the Databases from the AGs on the primary instances, but leaving the listener in place so the applications should still connect to the DBs via the VIP/DNS.

We have tested this and did not encounter any issues.

However we are struggling to find any information to say that this approach is best practice. We cannot suspend movement as the outage could be long and MS recommend movement is suspended for a short period only.

Anyone else had to deal with this situation? If yes – is our approach feasible?

Once the outage is complete we will disable log backups on the Primary nodes, delete the non-synched DBs on the secondary nodes and then add them all back in on the primary.

Any thoughts or input would be greatly appreciated.

Best Answer

Our plan is to remove the Databases from the AGs on the primary instances, but leaving the listener in place so the applications should still connect to the DBs via the VIP/DNS.

We cannot suspend movement as the outage could be long and MS recommend movement is suspended for a short period only.

I would not remove the databases on the primary node from the availability group.

I would, however, remove the affected secondary replicas from the availability group. Removing them would accomplish two things:

Since these are secondary replicas the databases will return to a restoring state. This will be helpful in the future.
Allows the primary and any unaffected secondary replicas to stay in the availability group and continue with log backups and re-use.

During the time the secondary replicas are removed from the AG, continue to take log backups as normal. This will facilitate the reuse of the log so that it doesn't grow out of control. Keep these log backups handy and ready for action.

Once the affected secondary replicas are no longer affected, copy all of the log backups taken while the secondary replicas were out of the AG and apply the log backups to those databases. When applying the log backups make sure to keep the databases in a restoring state by choosing WITH NORECOVERY on each log restore.

Finally, suspend the log backups and restore any final ones that were taken while restoring the older ones. This will bring the databases on the previously removed secondary replicas to the same time frame as the primary and any other secondary replicas.

Once the final log backup has been applied and the databases still left in a restoring state, add the replicas back into the AG. When this happens, since the databases are still in a restoring state and have been restored to the last log backup the AG will be able to join the replicas and databases without issue. There will be a short period of time where the replicas will need to catch up.

Once the secondary replicas and databases are rejoined, resume log backups as normal.

This would be the ideal process as it keeps your AG intact (for any unaffected secondary replicas), continues to leverage the listener for your applications, can still provide HA and to some extent DR depending upon the replicas available, continues to allow for log backups and re-use, stays transparent to the end user.

Best Answer

Related Solutions

Sql-server – SQL Server 2012, Network for: AlwaysOn Availability Group with AlwaysOn AlwaysOn Failover Cluster Instances

Sql-server – Failover Cluster and AlwaysOn Availability Groups across Data Centers

Related Question