Sql-server – SQL Server Database Mirroring : strange client application behavior upon failover

high-availabilitymirroringsql serversql-server-2008-r2

I am putting a HA SQL server environment up based on three SQL Server 2008 R2 machines and database mirroring.

I'll name them :

  • principal.company.intra
  • mirror.company.intra
  • witness.company.intra

The "company.intra" domain is the holding company's domain.

Both database engines are listening on the static 52002 port so client applications access them the following way :

principal.company.intra,52002 & mirror.company.intra,52002

Endpoints are called EP_Mirroring at principal and mirror, EP_Witness at witness and listen on the 5022 port at principal, 5023 at mirror and 5024 at witness.

Service accounts are correctly configured and granted connect permissions on each other's endpoints.

The mirroring feature is working fine, and databases correctly failover in case of TSQL manual failover or simulated system failure.

The problem is with applications behaving strangely upon failover.

The application testing context is the following :

A small .NET app consisting of two textboxes and a button :

When the button is clicked, it makes a stored proc call and fills Textbox1 with the sp's output and Textbox2 with the Data source property of my SqlConnection.

The connection string looks like :

Data source=principal.company.intra,52002;failover partner=mirror.company.intra,52002;
initial catalog = TEST_FAILOVER;user ID=user;password=pass;Connection Timeout=30

I launched this app from my laptop, located in another domain : laptop.childcompany.com

Scenario 1:

  1. I launch the app
  2. Button pressed, TB1 : sp output / TB2 : principal.company.intra,52002
  3. Database Failover
  4. Button pressed, connection timeout
  5. Button pressed, connection timeout
  6. DB Failover at mirror (back to principal)
  7. Button pressed, TB1 : sp output / TB2 : principal.company.intra,52002

Scenario 2:

  1. Database Failover
  2. Launching the app
  3. Button pressed, TB1 : sp output / TB2 : mirror.company.intra,52002
  4. Database Failover at mirror (back to principal)
  5. Button pressed, connection error
  6. Button pressed, TB1 : sp output / TB2 : principal.company.intra,52002
  7. Database Failover
  8. Button pressed, connection timeout
  9. Button pressed, connection timeout

I then tried to launch the app on the mirror server, locally via RDP

Scenario 3

  1. Launching the app
  2. Button pressed, TB1 : sp output / TB2 : principal.company.intra,52002
  3. Database Failover
  4. Button pressed, connection error
  5. Button pressed, TB1 : sp output / TB2 : MIRROR

I checked MSDN for Ole DB connectivity behavior in case of failover to understand why in the third scenario the data source property of my connection was set to MIRROR and not mirror.company.holding,52002

I learned that the failover server property of my connection string is only used in the case of an initial connection to the principal failing (explaining why scenario 2 correctly worked), but that in the case of an existing connection, the principal server has already provided the app with the mirror server address (in something called "failover cache"); the provided information is wrong, no listening port information and host name instead of FQDN thus failing outside the holding company domain.

Next step, I checked the sys.database_mirroring view :

select M.* 
from sys.databases D 
inner join sys.database_mirroring M on D.database_id = M.database_id 
where D.name = 'TEST_FAILOVER'

And noticed that the "mirroring_partner_instance" field containes "MIRROR" instead of "mirror.company.intra,52002".

A quick check to Books Online informed me that this is the field used to inform client apps of the failover partner upon initial connection to the principal DB Engine.

So, finally, my question is :

Is there a way to correct that behavior, and make the "mirroring_partner_instance" field hold the FQDN of the mirror, with the listening port?

At endpoint creation? At mirroring configuration level (via an alter database statement)? DB Engine configuration?

I already successfully tried setting up a SQL Server Native client alias at client computers as a workaround, however I'd really prefer if the principal server returned the correct failover information to client applications.

Best Answer

Suggest you check to see that each box has each FQDN registered with port, etc, for each target/ sql-srv, as well as the witness FQDN. Each server has all three registered with FQDN. Reinitialized the endpoints and cross your fingers ! :) If in DMZ, are you able to use AD inside DMZ ? or maybe there is trust back out the corp AD ? If short names registered and FQDN, think this trust might be giving you fits, so presented with non-FQDN being used. Rick