Sql-server – SQL Server transaction replication agent error

replicationsql serversql-server-2008-r2

We are running transaction replication on SQL Server 2008 R2 SP3. Distribution agents are reporting :

"Agent is retrying after an error. 3 retries attempted"

every 10 minutes. Tried changing the query timeout values, restarted SQL Server agent and stopped and restarted the distribution agent. There are no errors in the job log, SQL Server error logs or windows event logs that appear relevant.

Best Answer

Note: This answer is based on OP's comment under the question. So credit goes to them for finding the solution.

Faced this issue my self, which is how I ended up here. Like @PMSawyer said, there was a negative value in the run_duration column of sys.jobhistory for the Agent job.

The following query can be used to find if there are any negative values.

SELECT * FROM msdb.dbo.sysjobhistory WHERE job_id = ( SELECT job_id FROM msdb.dbo.sysjobs WHERE name = 'INPUT_YOUR_AGENT_JOB_NAME_HERE' ) AND run_duration < 0 ORDER BY run_date DESC

If this query produces an output, it is very likely that that is what is causing the problem. The negative value will need to be removed by purging the agent job's history, the following query can be used for that.

msdb.dbo.sp_purge_jobhistory @job_name = 'INPUT_YOUR_AGENT_JOB_NAME_HERE'

The output would look something like: 12 history entries purged..

Purging the job history doesn't affect replication in any way, in case that's a concern.

Also, in my case the reason this negative value came to be was that the server time had changed in accordance with Day Light Savings. The server clock and moved an hour behind.

On an unrelated note, I don't know fancy techniques like linking to OP's comment that provided the basis for this answer, if anyone cares enough to do that, it'd be great.