Configuring a server to accept read-only traffic takes several steps. This page walks you through them: http://msdn.microsoft.com/en-us/library/hh710054.aspx. Basically, you need to allow read-only connections on each server in the AG, give each one a read-only routing URL, and then set up the routing list each one uses when it is the primary.
Here's the T-SQL involved:
-- COMPUTER01: allow read-only connections while it is a secondary,
-- and set the URL that read-intent connections are routed to.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON
N'COMPUTER01' WITH
(SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON
N'COMPUTER01' WITH
(SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://COMPUTER01.contoso.com:1433'));
-- COMPUTER02: same two settings.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON
N'COMPUTER02' WITH
(SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON
N'COMPUTER02' WITH
(SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://COMPUTER02.contoso.com:1433'));
-- Routing lists: each one applies while that replica holds the primary role.
-- The other replica is listed first, so read-intent connections go to the secondary.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON
N'COMPUTER01' WITH
(PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('COMPUTER02','COMPUTER01')));
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON
N'COMPUTER02' WITH
(PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('COMPUTER01','COMPUTER02')));
GO
Sounds like you may be missing the configuration and/or routing information for the primary.
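If you want to check what is actually configured, something like the following should show it. This is only a sketch: it assumes the AG is named AG1 as in the example above, and it reads the standard catalog views sys.availability_replicas and sys.availability_read_only_routing_lists.

```sql
-- Per replica: are read-only connections allowed, and is a routing URL set?
SELECT ar.replica_server_name,
       ar.secondary_role_allow_connections_desc,
       ar.read_only_routing_url
FROM sys.availability_replicas AS ar
JOIN sys.availability_groups AS ag
    ON ag.group_id = ar.group_id
WHERE ag.name = N'AG1';   -- assumed AG name, taken from the example

-- Routing lists: for each replica acting as primary, where are
-- read-intent connections sent, and in what order?
SELECT pr.replica_server_name AS when_primary,
       rl.routing_priority,
       sr.replica_server_name AS routes_to
FROM sys.availability_read_only_routing_lists AS rl
JOIN sys.availability_replicas AS pr
    ON pr.replica_id = rl.replica_id
JOIN sys.availability_replicas AS sr
    ON sr.replica_id = rl.read_only_replica_id
ORDER BY when_primary, rl.routing_priority;
```

A replica with no rows in the second result has no routing list for when it is the primary. Also keep in mind that clients are only routed if they connect through the listener with ApplicationIntent=ReadOnly and name an availability database in the connection string.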
Our plan is to remove the Databases from the AGs on the primary
instances, but leaving the listener in place so the applications
should still connect to the DBs via the VIP/DNS.
We cannot suspend movement as the outage could be long and MS
recommend movement is suspended for a short period only.
I would not remove the databases on the primary node from the availability group.
I would, however, remove the affected secondary replicas from the availability group. Removing them would accomplish two things:
- Since these are secondary replicas, the databases will return to a restoring state. This will be helpful in the future.
- The primary and any unaffected secondary replicas stay in the availability group and continue with log backups and log reuse.
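A minimal sketch of the removal, run on the primary. The names AG1 and COMPUTER02 are placeholders standing in for your AG and the affected secondary.

```sql
-- Run on the PRIMARY replica.
-- AG1 and COMPUTER02 are placeholder names, not taken from your environment.
ALTER AVAILABILITY GROUP [AG1]
REMOVE REPLICA ON N'COMPUTER02';
```

Repeat the statement for each affected secondary replica.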
During the time the secondary replicas are removed from the AG, continue to take log backups as normal. This will facilitate the reuse of the log so that it doesn't grow out of control. Keep these log backups handy and ready for action.
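As a rough illustration, a log backup plus a check that the log is being reused could look like this. The database name SalesDb and the backup path are made up for the example; in practice your existing backup job would simply keep running.

```sql
-- Run on the primary (or wherever your log backups normally run).
-- SalesDb and the UNC path are hypothetical placeholders.
BACKUP LOG [SalesDb]
TO DISK = N'\\backupshare\SalesDb\SalesDb_log_001.trn';

-- Check what, if anything, is preventing log truncation.
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'SalesDb';
```

If log_reuse_wait_desc keeps showing AVAILABILITY_REPLICA, a replica that is still in the AG is holding the log.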
Once the affected secondary replicas are available again, copy over all of the log backups taken while they were out of the AG and apply them to the databases on those replicas. When applying the log backups, make sure to keep the databases in a restoring state by choosing WITH NORECOVERY on each log restore.
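Applying the backups might look roughly like this on each previously removed secondary. The database name and file names are hypothetical; restore the files in the order they were taken.

```sql
-- Run on the previously removed SECONDARY.
-- SalesDb and the backup file names are hypothetical placeholders.
RESTORE LOG [SalesDb]
FROM DISK = N'\\backupshare\SalesDb\SalesDb_log_001.trn'
WITH NORECOVERY;   -- keep the database in a restoring state

RESTORE LOG [SalesDb]
FROM DISK = N'\\backupshare\SalesDb\SalesDb_log_002.trn'
WITH NORECOVERY;   -- NORECOVERY on every restore, including the last one
```

Never use WITH RECOVERY here. A recovered database cannot be joined back to the AG without a fresh restore.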
Finally, suspend the log backups and restore any final ones that were taken while you were restoring the older ones. This will bring the databases on the previously removed secondary replicas to the same point in time as the primary and any other secondary replicas.
Once the final log backup has been applied and the databases are still in a restoring state, add the replicas back into the AG. Since the databases are still restoring and have been restored up to the last log backup, the AG will be able to join the replicas and databases without issue. There will be a short period of time where the replicas will need to catch up.
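For reference, adding a replica back and joining its database is roughly the following. Treat it as a sketch: AG1, COMPUTER02, SalesDb, the endpoint URL with port 5022, and the availability and failover modes are all assumptions, so substitute the values the replica had before it was removed.

```sql
-- Step 1: run on the PRIMARY. All names and option values are placeholders.
ALTER AVAILABILITY GROUP [AG1]
ADD REPLICA ON N'COMPUTER02' WITH
(
    ENDPOINT_URL      = N'TCP://COMPUTER02.contoso.com:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE     = MANUAL
);

-- Step 2: run on the SECONDARY being added back (COMPUTER02 here).
ALTER AVAILABILITY GROUP [AG1] JOIN;

-- Step 3: still on the secondary, once per database left in a restoring state.
ALTER DATABASE [SalesDb] SET HADR AVAILABILITY GROUP = [AG1];
```

Any read-only routing settings the replica had before it was removed would need to be applied again as well, since they are stored per replica.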
Once the secondary replicas and databases are rejoined, resume log backups as normal.
This would be the ideal process because it keeps your AG intact (for any unaffected secondary replicas), continues to leverage the listener for your applications, still provides HA and, to some extent, DR depending upon the replicas available, continues to allow for log backups and log reuse, and stays transparent to the end user.
Best Answer
You don't need to suspend data movement, but otherwise what you've written for option #1 is a good approach! Before doing anything, I would disable automatic failovers on all replicas, to prevent any unexpected failovers when you bring the AG back up.
Then, as you said, shut down the secondaries first, then shut down the primary.
When you bring the primary back up, it should come online as the primary again. Then bring up the secondaries.
Make sure to turn automatic failover back on once everything is stable, if you had it on in the first place.
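Switching the failover mode is one statement per replica, run on the primary. A sketch, with AG1 and COMPUTER01 as placeholder names:

```sql
-- Note the current settings first so you know what to restore afterwards.
SELECT replica_server_name, availability_mode_desc, failover_mode_desc
FROM sys.availability_replicas;

-- Before the shutdown: run on the PRIMARY, once per replica that is
-- currently set to automatic failover. Names are placeholders.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON N'COMPUTER01'
WITH (FAILOVER_MODE = MANUAL);

-- After everything is back up and synchronized: restore the old setting.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON N'COMPUTER01'
WITH (FAILOVER_MODE = AUTOMATIC);
```

Automatic failover is only valid on synchronous-commit replicas, so the last statement will fail on a replica that is in asynchronous-commit mode.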