Sql-server – Help with Replication from an Always On Availability Group

availability-groupsreplicationsql server

I have a two node SQL Server 2016 Multi subnet Availability group. I am trying to setup transnational replication from the AG to a standalone instance for reporting, which is also acting as a remote distributor. We only need a small subset of the tables so adding the standalone instance as a third replica isn't an option.

I've followed the Microsoft Guide for configuring this. I've run the EXEC sys.sp_redirect_publisher and when I run EXEC sys.sp_validate_replica_hosts_as_publishers it returns the AG listener. But when we failover the AG the replication stops working with the following error reporting in replication monitor

Error messages: Unable to query the redirected server
'AGL' for original publisher 'PUB1' and publisher
database 'AlwaysOnTest' to determine the name of the remote server;
Error 7303, Error message 'Error 7303, Level 16, State 1, Message:
Cannot initialize the data source object of OLE DB provider
"SQLNCLI11" for linked server
"[B66E8E27-E422-41DB-BBBC-254D450468D7]".'. (Source: MSSQL_REPL, Error
number: MSSQL_REPL21879) Get help: http://help/MSSQL_REPL21879 Errors
were logged when validating the redirected publisher. (Source:
MSSQL_REPL, Error number: MSSQL_REPL22037) Get help:
http://help/MSSQL_REPL22037

Couple of things I've checked and tried

  1. Checked linked servers for the subscriber exist on all secondary replica hosts.
  2. Switched the security mode to SQL auth using sp_changedistpublisher based on this.

When the AG is failed over if I re-run on the distributor EXEC sys.sp_validate_replica_hosts_as_publishers it returns

Unable to query sys.availability_replicas at the availability group
primary associated with virtual network name 'AGL' for the server
names of the member replicas: error = 7303, error message = Error
7303, Level 16, State 1, Message: Cannot initialize the data source
object of OLE DB provider "SQLNCLI11" for linked server
"[7A7DA613-BFDA-4C51-B62C-9C7CD94E769E]"..',

I am stuck what to try next….help please!!!

UPDATE 16/06/2020

When the AG is failed over I tried connecting to the AG listener from the remote distributor. SSMS connected fine, but when I tried connecting with SQLCMD it returned an error. Also I can connect to the failover instance directly from the distributor, bypassing the listener.

Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : Named Pipes
Provider: Could not open a connection to SQL Server 2. . Sqlcmd:
Error: Microsoft ODBC Driver 13 for SQL Server : Login timeout
expired. Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : A
network-related or instance-specific error has occurred while
establishing a connection to SQL Server. Server is not found or not
accessible. Check if instance name is correct and i f SQL Server is
configured to allow remote connections. For more information see SQL
Server Books Online..

Best Answer

The root cause of this was connection timeouts because the availability group was in a Multi-subnet Availability Group. This was resolved by following this document from Microsoft.