SQL Server 2005 – Why Mirrored Databases Change States

mirroringsql-server-2005

Setup:

  • SQL Server 2005 w/SP4 64-bit Standard for all three servers
  • Windows Server 2008 R2 Enterprise w/SP1 64-bit for all three servers
  • 01 is designated as the Principal
  • 02 is designated as the Mirror
  • 03 is designated as the Witness

My issue is that there are Principal and Mirrored DBs on both servers and it is causing Maintenance Plans and SQL scripts to fail.

My questions:

  1. Why are the DBs failing from one to the other when 01 should have all Principal DBs?
  2. How do I stop this behavior
  3. What is the least invasive process of changing the Principal DBs on 02 back to 01?

Best Answer

Can you please look at SQL Server errorlog and see if you could find some relevant information on both Principal and mirror. Generally if SQL Server CPU utilization goes high above 50 % there can be failover. This is specifically caused by heavy load on SQL Server this is documented in Below Microsoft Web resource See 'recommendation for Configuring partner Server' in link : http://msdn.microsoft.com/en-us/library/ms366349%28v=sql.100%29.aspx

This can be avoided by Alter partner time out

Alter database db_name set partner timeout 20

Edit: After User provided more details

Database mirroring connection error 4 'An error occurred while receiving data: '64(The specified network name is no longer available.)'.

As per this message its clear that network connecting principal and mirror flipped I dont know for how much time but this caused the failover. What happens is Mirror server keeps sending pings to Principal whether it is alive or not and suppose Principal goes down and now mirror will send 10 pings to principal(By default) and if it does not gets any response it will initiate failover with help of Witness( its little more complex with witness ,it involves quorum)

If it was for moment you can increase partner timeout( not a good practice though) but you should Immediately speak to network team and ask them RCA for this glitch as this caused failover. My course of action would be to meet them personally ask them to run bunch of commands to check response time