Sql-server – SQL Server : Job not reconnecting to restarted Linked Server

sql serversql-server-2012sql-server-agentt-sql

The problem

To keep two tables across two servers in sync, we have a trigger that adds a change to a MergeTable. Every so often, a merging job runs on SQL Server to push the changes to the Progress linked server.

I was testing connectivity issues when I ran into this problem:

  1. shut down Progress DB
  2. job fails as expected
  3. start up Progress DB
  4. job continues to not connect to Progress DB until manually disabled and re-enabled

I get many multiples of these three messages:

OLE DB provider "MSDASQL" for linked server "PROGRESS_TEST" returned message "[DataDirect][ODBC Progress OpenEdge Wire Protocol driver][OPENEDGE]Server crash likely.". [SQLSTATE 01000]

OLE DB provider "MSDASQL" for linked server "PROGRESS_TEST" returned message "[DataDirect][ODBC Progress OpenEdge Wire Protocol driver]Socket closed.". [SQLSTATE 01000]

OLE DB provider "MSDASQL" for linked server "PROGRESS_TEST" returned message "[DataDirect][ODBC Progress OpenEdge Wire Protocol driver]Unexpected Network Error. ErrNum = 10054". [SQLSTATE 01000]

What I've tried

I can execute sp_testlinkedserver and get an error, so I can detect WHEN it happens:

Msg 7303, Level 16, State 1, Procedure sp_testlinkedserver, Line 1
Cannot initialize the data source object of OLE DB provider "MSDASQL" for linked server "PROGRESS_TEST".

I attempted to call sp_dropserver, sp_addlinkedserver, and sp_testlinkedserver in quick succession, which does not throw an error, but the job thinks it successfully completes when it didn't.

I'm quite new to SQL/T-SQL. There's no sp_reconnectlinkedserver, and I can't replicate this issue outside the job; running the individual procedures works when the database gets back up. I'm thinking this is a session issue, but I wouldn't know where to begin fixing it.

Best Answer

Querying the linked server seems to get it to reconnect. Calling sp_testlinkedserver doesn't always work, but a select statement will.

Sometimes there is a two minute lag from the start of the downed server to the next successful run of the job. In any case, it's simply a matter of waiting.