Sql-server – SQL Server AlwaysOn Error_number 35206

availability-groups, connectivity, sql server, sql-server-2012

I've joined an e-commerce business that uses SQL Server 2012 AlwaysOn for HADR, and this is the first time I've had to support this technology.

The SQL Server log on both nodes gets spammed almost daily with the error messages shown below.

What do these errors mean? Are they benign?

I asked the infrastructure team to take a look at the system and network logs, and they couldn't see anything odd at the times when the messages below were logged. The AlwaysOn Dashboard records error 35206 against the connection timeout message, and the BOL explanation does not make clear whether I should be addressing these timeouts.

Message
A connection timeout has occurred on a previously established connection to availability replica 'SQLWEB02' with id [A794F13A-6FDA-4E7C-B418-A70B320D1AB3]. Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.

Message
AlwaysOn Availability Groups connection with primary database terminated for secondary database 'DB1' on the availability replica with Replica ID: {a794f13a-6fda-4e7c-b418-a70b320d1ab3}. This is an informational message only. No user action is required.

Message
BACKUP failed to complete the command BACKUP LOG DB1. Check the backup application log for detailed messages.

Message
A connection for availability group 'AG2' from availability replica 'SQLWEB01' with id [CC17411E-D3E1-4B32-95D8-8D69BB94E7E0] to 'SQLWEB02' with id [A794F13A-6FDA-4E7C-B418-A70B320D1AB3] has been successfully established. This is an informational message only. No user action is required.

Message
AlwaysOn Availability Groups connection with primary database established for secondary database 'DB1' on the availability replica with Replica ID: {a794f13a-6fda-4e7c-b418-a70b320d1ab3}. This is an informational message only. No user action is required.

Message
Error: 35285, Severity: 16, State: 1.

Message
The recovery LSN (187402:6392:1) was identified for the database with ID 11. This is an informational message only. No user action is required.

Thanks

Best Answer

These errors mean a connection timed out; the subsequent messages are consequences of that timeout. In a multi-subnet environment, each AG has an IP address in each subnet, all registered in DNS, so the DNS server returns multiple IP addresses when queried. Only one of them will be online. The client tries the IP addresses serially until a connection is established. The TCP timeout is 21 seconds, so if the first IP address is not online, the client waits 21 seconds before trying the next one. The default connection timeout for the .NET client libraries is 15 seconds, so the attempt can time out before the online address is ever tried. This article covers the details: http://blogs.msdn.com/b/alwaysonpro/archive/2014/06/03/connection-timeouts-in-multi-subnet-availability-group.aspx
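
To make the arithmetic concrete, here is a rough Python sketch of the serial-attempt timing described above. It is only a model, not how any driver is actually implemented: the function names are made up for illustration, and the 21-second and 15-second figures are simply the values quoted in this answer.

```python
def seconds_until_online_ip(ips_online, tcp_timeout=21.0):
    """Seconds spent waiting on offline IPs before the first online one is tried.

    ips_online is a list of booleans in the order DNS returned the addresses
    (a hypothetical input for this sketch). Returns None if no IP is online.
    """
    elapsed = 0.0
    for online in ips_online:
        if online:
            return elapsed
        # An offline address costs a full TCP timeout before the next try.
        elapsed += tcp_timeout
    return None


def login_succeeds(ips_online, login_timeout=15.0, tcp_timeout=21.0):
    """True if the online IP is reached before the client gives up."""
    wait = seconds_until_online_ip(ips_online, tcp_timeout)
    return wait is not None and wait < login_timeout


if __name__ == "__main__":
    # Online address returned first: no waiting at all.
    print(login_succeeds([True, False]))
    # Offline address returned first: 21 s of waiting against a 15 s limit.
    print(login_succeeds([False, True]))
    # Same DNS order, but with a longer (assumed) 30 s client timeout.
    print(login_succeeds([False, True], login_timeout=30.0))
```

With the offline address first, the wait (21 seconds) exceeds the default 15-second client timeout, which is the failure mode this answer describes.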

If you don't have a multi-subnet environment, I would look at the network logs again and check how long your TCP connections were taking to establish.
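
If the network logs don't record connection times, one way to get a rough number is to time a plain TCP connect from one node to the other. The sketch below uses only the Python standard library; the host name comes from the log messages in the question, while the port (5022) is an assumption and should be replaced with whatever port your availability group endpoint actually listens on.

```python
import socket
import time


def tcp_connect_seconds(host, port, timeout=5.0):
    """Return how long a TCP connect to (host, port) took, in seconds.

    Returns None if the connection could not be established within
    `timeout` seconds (refused, unreachable, or timed out).
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None


if __name__ == "__main__":
    # 'SQLWEB02' is the replica named in the question; 5022 is an assumed port.
    elapsed = tcp_connect_seconds("SQLWEB02", 5022)
    if elapsed is None:
        print("connect failed or timed out")
    else:
        print(f"connected in {elapsed:.3f} s")
```

Running this on a schedule around the times the errors appear would show whether connect times spike, though it only measures connection setup and says nothing about an already established session going quiet.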