The reason that this fails outside of the internal Rackspace location is due to the URL Endpoints being set to a value that is not able to be connected through from your local environment.
I discuss this process at a high level in this blog post, however to quickly recap the point that needs to be made here:
The endpoint url is the address where the connection will be routed in order to connect and run their queries. [roughly speaking]
The URL is active directory specific FQDN, so I cannot use it as is
from ourside the the racksapce domain. The ROR URL is some thing like
"tcp://1234-db1.abc.intensive.int:1433"
This means that the client driver is going to be re-pointed to this address; if the address is not reachable then the driver won't be able to connect and you'll have a problem - which you happened to run into.
In your example, the client driver will get back
"tcp://1234-db1.abc.intensive.int:1433" and attempt to connect to it like it would any other instance of SQL Server. Since it can't (as
you've stated) you won't be able to be routed and your connection
should then be on the primary. What I did was I changed the ROR URL
for each node to public IP address instead of the FQDN/ Host-name, so
when the request is sent to AG listener from outside it knows where to
hand-off in the public facing IP. It looks like, if we have to connect
from outside I have to use public IP (Eg: 72.32.XX.XX) as opposed to
tcp://1234-db1.abc.intensive.int:1433 in the ROR.
While the IP may or may not need to be public (different architectures at different companies' may or may not require different things), it seems in your case it does... though I'm not a networking architecture guru so I can't comment on how/why this was implemented or any technical challenges observed.
Also when we deploy the application in the same domain as the db
servers Active Directory domain "abc.intensive.int" would that pose a
problem when apps try to connect to the db AG listener?
The domain shouldn't make a difference here, the real difference is whether or not your DNS is setup where it can do the lookups required to get the information (IIRC this would be a reverse lookup zone but don't quote me on that). If you can resolve the DNS name, it will work fine, if you can't it obviously won't... this is more of a question for your DNS/AD admins as they hold all the information to that kingdom.
Best Answer
As per my understanding, this issue could be due to the multi-threading model or parallelism.
Until SQL Server 2016, the
transaction log redo was handled by single redo thread
for each database, this model is called as serial redo.Starting from SQL Server 2016, the default redo model is parallel redo. It provides multiple threads for transactions and additional worker threads for dirty page flush IO operations.
While working with the concurrent workload,
DIRTY_PAGE_TABLE_LOCK
waits are generated frequently by the worker thread, and at the same time, query threads will try to access the tables, this will cause the performance issues on both SELECT query and redo operations.This issue was fixed on the
latest Cummulative Update
- https://support.microsoft.com/en-in/help/4135048/cumulative-update-1-for-sql-server-2016-sp2As a workaround, you can disable the parallel redo model and enable the serial redo by enabling the trace flag 3459.
I hope this helps! Thank you.