Sql-server – Pausing AlwaysOn Availability Group replication for failover

availability-groupssql serversql server 2014transactional-replication

We are currently implementing a new software package which has one part of its system using database replication (but doesn't support AlwaysOn Availability Groups) replicate from another database that does support AOAG's.

Currently, they only support replication setup via their installer, which creates the entire config on both the publisher and subscriber servers in one go.

It is using transactional replication, and there are both pull and push subscriptions.

The installer creates the distributor on the publisher, which I assume will break replication after a failover and require generation of new snapshots and reinitializion of subscriptions.

Assuming we are doing a planned failover (say for O/S patching), looking to pause replication until we can failover back to the node with the distributor.

Am I on the right track in saying the most reliable method of "pausing" replication would be to use sp_MSstopdistribution_agent on the publisher (for PUSH) and on subscriber (for PULL) before failover to the (non-distributor) node, then do a sp_MSstartdistribution_agent after we fail back again?

We are using SQL 2014 on both publisher and subscriber.

Best Answer

I think you are absolutely right on here:

The installer creates the distributor on the publisher, which I assume will break replication after a failover and require generation of new snapshots and reinitializion of subscriptions.

a good way to relate the replication to the availability group is:

The primary is SP1 and the secondary replica is SS2. The distributor is on another server SD3. The replication is working fine. We redirected the Original Publisher to the AG Listener Name using this script:

USE distribution;
GO  
EXEC sys.sp_redirect_publisher   
    @original_publisher = 'MyPublisher',  
    @publisher_db = 'MyPublishedDB',  
    @redirected_publisher = 'MyAGListenerName'; 

You can find more details on this link:

Replication log reader not updated to new primary after availability group failover

In an AlwaysOn availability group a secondary database cannot be a publisher.

The failover of a distributor on an availability database is not supported.

I would suggest you use another server as a distributor, as you are probably already doing.

I normally use another server as a distributor and several distributor databases depending on the size and how busy your publications are.

I am not entirely sure about the push subscriptions but for the other stuff you could have a look at this link:

Setting up Replication on a database that is part of an AlwaysOn Availability Group

SRV1: Original Publisher

SRV2: Publisher Replica

SRV3: Publisher Replica

SRV4: Distributor and Subscriber (You can choose a completely new server to be the distributor as well, however do not have a distributor on any of the publishers in this case as the failover of a distributor is not supported in this case).

enter image description here

maybe this will not apply on your case, but just for reference:

Configure Replication for Always On Availability Groups (SQL Server)

Am I on the right track in saying the most reliable method of "pausing" replication would be to use sp_MSstopdistribution_agent on the publisher (for PUSH) and on subscriber (for PULL) before failover to the (non-distributor) node, then do a sp_MSstartdistribution_agent after we fail back again?

I would say in most cases yes, I used to use those procedures as you can see (scripts included) on this link:

How to restart the distributor agent of transactional replication?

Now I prefer to use the following method:

-- on the publisher
-- this gives you the distributor server name and distributor database name

sp_helpdistributor


-- now connect to the distributor server and database 
-- get the job name 

-- get the distributor job name for the publication you want
-- in this example my publication is called 'DEOrder'
use distribution
go

my database and publication are called ATOrder in the example below.

-- the snapshot agent
select  job_name=name, publisher_db, publication from  distribution.dbo.mssnapshot_agents
where publisher_db = N'ATOrder'

-- the distributor jon transactional replication
select job_name=name, publisher_db, publication from  distribution.dbo.MSlogreader_agents
where publisher_db = N'ATOrder'

-- if it is merge replication
select  job_name=name, publisher_db, publication  from MSmerge_agents
where publisher_db = N'ATOrder'

-- who are my publishers and who are my subscribers
select * from MSsubscriber_info

-- just checking dbs
select * from MSpublisher_databases



-- get the distributor job (log reader) for a specific publication
select job_name=name, publisher_db, publication from  distribution.dbo.MSlogreader_agents
where publisher_db = N'ATOrder'

-- take note of the job name
--my_publication_server_name_and_instance2-ATOrder-25

-- check that the job is running at the moment - it must be!!
sp_runningjobs 'my_publication_server_name_and_instance2-ATOrder-25'


--check I am targeting the right job
exec msdb.dbo.sp_help_job @job_name = 'my_publication_server_name_and_instance2-ATOrder-25'

--stop the job
exec msdb.dbo.sp_stop_job @job_name = 'my_publication_server_name_and_instance2-ATOrder-25'

--start the job
exec msdb.dbo.sp_start_job @job_name = 'my_publication_server_name_and_instance2-ATOrder-25'

for more information, as to see if everything is going on alright, or who and what is getting a bit slow, you can check this link:

Replication Monitor Information using T-SQL

We are using SQL 2014 on both publisher and subscriber.

Lucky you