SQL Server – Communication link failure for some queries to linked server

linked-server network sql-server-2008

I am seeing the error below in SSMS when running some queries against a linked server, in particular long-running queries. The server runs Windows Server 2008 and Microsoft SQL Server 2008 R2 (RTM) – 10.50.1600.1.

Simple selects from tables across the linked server work fine. The issue is new: we noticed it when stored procedures (SPs) that had worked for years started failing.

I ran a Wireshark capture on the server, capturing packets to port 1433 on the linked server host. At the tail of the capture, I see 10 TCP Keep-Alives being issued (after a message regarding a bad checksum), followed by an RST packet. The RST packet correlates with the error below being returned to the client.

There are other database servers on our network, where the linked server is configured identically, that don't exhibit this issue.

I have found some articles such as this and this. We are using the implicated Broadcom NICs. The Chimney Offload State setting is enabled on the server.

We will try disabling it. Other thoughts on troubleshooting would be much appreciated.
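For anyone wanting to confirm the Chimney Offload State on their own server, here is a minimal sketch that reads it from the output of `netsh int tcp show global`. The function names and the sample text are hypothetical (the sample is illustrative, not captured from the server in this question), and the parsing assumes the English-language label that netsh prints.

```python
import re
import subprocess

# Illustrative sample of `netsh int tcp show global` output (assumed layout,
# not captured from the server discussed in this question).
SAMPLE_OUTPUT = """
TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : enabled
"""


def parse_chimney_state(netsh_output):
    """Return the Chimney Offload State value in lowercase, or None if absent."""
    match = re.search(
        r"^\s*Chimney Offload State\s*:\s*(\S+)",
        netsh_output,
        re.MULTILINE | re.IGNORECASE,
    )
    return match.group(1).lower() if match else None


def read_chimney_state():
    """Run netsh (Windows only) and return the parsed state."""
    result = subprocess.run(
        ["netsh", "int", "tcp", "show", "global"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_chimney_state(result.stdout)
```

Calling `parse_chimney_state(SAMPLE_OUTPUT)` on the sample returns `"enabled"`. `read_chimney_state()` only makes sense on the Windows host itself.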

OLE DB provider "SQLNCLI10" for linked server "myServer" returned message "Protocol error in TDS stream".
OLE DB provider "SQLNCLI10" for linked server "myServer" returned message "Communication link failure".
Msg 65535, Level 16, State 1, Line 0
Session Provider: Physical connection is not usable [xFFFFFFFF]. 
OLE DB provider "SQLNCLI10" for linked server "myServer" returned message "Communication link failure".
Msg 65535, Level 16, State 1, Line 0
Session Provider: Physical connection is not usable [xFFFFFFFF]. 
OLE DB provider "SQLNCLI10" for linked server "myServer" returned message "Communication link failure".
Msg 64, Level 16, State 1, Line 0
TCP Provider: The specified network name is no longer available.

Best Answer

We disabled TCP Chimney Offload per the Symantec article and rebooted the servers, and the issue appears to be resolved.

The problematic SPs are no longer throwing the communication link failure exception.
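For reference, a rough sketch of how the change could be scripted. It assumes the setting is changed with `netsh int tcp set global chimney=disabled` from an elevated prompt, which is the usual way on Windows Server 2008. The helper names are hypothetical, and the default is a dry run that only returns the command without executing it.

```python
import subprocess

# States this sketch accepts; netsh may accept others depending on the OS version.
ALLOWED_STATES = ("enabled", "disabled")


def chimney_command(state="disabled"):
    """Build the netsh argument list that sets the TCP Chimney Offload state."""
    if state not in ALLOWED_STATES:
        raise ValueError("unsupported chimney state: %r" % (state,))
    return ["netsh", "int", "tcp", "set", "global", "chimney=" + state]


def set_chimney(state="disabled", dry_run=True):
    """Return the command; execute it only when dry_run is False (needs admin rights)."""
    cmd = chimney_command(state)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `set_chimney()` with the defaults just returns the argument list, so the command can be reviewed before being run for real with `dry_run=False`.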

More info from Microsoft on TCP Offloading