SQL Server – Troubleshooting Read-Only Routing in Availability Groups

availability-groupssql serversql-server-2017

First off, my apologies if this has already been answered elsewhere. I have not been able to find any articles or threads describing my exact situation, but no one's Google-fu is perfect.

With that out of the way, here's my situation: I have a SQL Server 2017 (14.0.1000.169) Enterprise environment with three AG replicas (one primary, one secondary with synchronous commit, and one secondary with asynchronous commit), and whenever I try to connect to the Listener using ApplicationIntent=ReadOnly and specifying an AG database, it gives me the following error:

A network-related or instance-specific error has occurred while
establishing a connection to SQL Server. Server is not found or not
accessible. Check if instance name is correct and if SQL Server is
configured to allow remote connections.

It connects with no issues if I don't specify the ApplicationIntent flag, or if I don't specify a database. Here are the things I have already checked:

  • Listener is configured, and can be connected to normally.
  • Read-Only Routing URLs are configured for each replica, using the FQDN and TCP port for that instance (ran the script provided here to verify that I have those correct).
  • Read-Only Routing Lists are configured for each replica.
  • The sync secondary is in Synchronized status, and the async secondary is in Synchronizing status.
  • The secondary replicas are set to allow all connections (ALLOW_CONNECTIONS = ALL) and can be queried directly using their own instance names.

Each SQL Server instance that hosts a replica is using a different TCP port from the others for both its Endpoint URL and its Read-Only Routing URL (six different ports between the three instances), and the Listener is listening on a different port from all of those. I don't think this has anything to do with the problem, but I'm mentioning it for the sake of thoroughness.

Like I said above, I haven't found anything useful for my exact situation, since all the results I've been able to find have just been people who haven't set up their routing lists or whatever. Any ideas on next steps here or anything I might be missing? Please let me know if you need any additional info about the environment.

Best Answer

The human element was the problem, as usual. Turns out I had accidentally used the primary replica's server name in the Read-Only Routing URLs for all three replicas. There being only one character difference between each name made it difficult to spot, even though I'd looked at them dozens of times.

Thanks to @SeanGallardy for prompting me to pull the URLs, which caused me to notice the problem.

Thanks to @clifton_h for also making a good-faith effort to help. Y'all good people!