Sql-server – Manually moving SQL instance to another node causes immediate database mirror failover

clusteringmirroringsql serversql-server-2008-r2

I have a two-node SQL cluster (2008 R2).

Some of the databases within that SQL instance are mirrored to another server on a remote site, using the High safety with automatic failover. The mirroring connection timeout value for those databases is set to 90 seconds.

When I move SQL from one node in the cluster to another node, using the Failover Cluster Manager application's option of "Move this service or application to another node" the databases that are mirrored are instantly failing over to the mirror.

This is undesirable behaviour. My reason for setting the mirroring connection timeout value is that I only want to fail over to the database mirror if the cluster fails completely and there are no functioning nodes.

Is there any way to achieve this? It feels as though it should be possible, otherwise the concept of mixing clustering and automatic failover database mirroring would be unworkable as every node failover within the cluster would trigger a mirror failover.

Thanks.

Best Answer

One way to prevent the mirror failover is:

  1. Pause Mirroring with ALTER DATABASE XYZ SET PARTNER SUSPEND
  2. Move the SQL instance
  3. Resume mirroring with ALTER DATABASE XYZ SET PARTNER RESUME

The instance is failing over to the mirror because both the witness and the secondary can no longer see the primary instance.

It sounds like you are attempting to recreate SQL Server 2012 Availability Groups by combining mirroring and clustering.

Database mirroring only times-out on so-called "soft" errors. Hard errors, like a cluster failing over are reported to the mirroring session immediately causing the immediate failover. Read more at http://msdn.microsoft.com/en-us/library/ms190913.aspx

Possible causes of hard errors include (but are not limited to) the following conditions:

A broken connection or wire

A bad network card

A router change

Changes in the firewall

Endpoint reconfiguration

Loss of the drive where the transaction log resides

Operating system or process failure

Conditions that might cause mirroring time-outs include (but are not limited to) the following:

Network errors such as TCP link time-outs, dropped or corrupted packets, 
    or packets that are in an incorrect order.

A hanging operating system, server, or database state.

A Windows server timing out.

Insufficient computing resources, such as a CPU or disk overload, the transaction 
    log filling up, or the system is running out of memory or threads. In these 
    cases, you must increase the time-out period, reduce the workload, or change 
    the hardware to handle the workload. 

For more information about mirroring and potential issues, you might want to see my question What can cause a mirroring session to timeout then failover? SQL Server 2005