Sql-server – SQL Server 2005 Replication Stops after a minute without an error

sql-server sql-server-2005

Background:

I have a SQL Server 2005 setup with one master and two slaves (slave1 and slave2), configured as pull replication from the slaves. The distribution database resides on the slave1 machine, and both slaves pull.

A problem began today where the replication on slave1 simply stops running. It claims that it completed successfully, but it does not restart, and manually starting the process finishes in roughly one minute, again without an error message.

Replication is running fine on slave2, but I can't seem to figure out what's wrong on slave1. I've tried the obvious Windows debugging 101: "restart the machine" technique, but to no avail.

Has anyone encountered this before? Does anyone have an idea of what I could check or change to get it working again? I'm especially at a loss because SQL Server claims that the job is finishing successfully.

Best Answer

SQL Server Replication is a notorious error hider.

Usually, if replication just stops, a problem occurred, and there is only one place where you can see what that problem is. All the other places you might look show no sign of trouble.

Places to include when hunting for errors:

  1. The replication monitor
  2. The job history of each job involved (Look at the history. Even the SQL Agent Monitor hides a replication problem every once in a while.)
  3. The SQL Server error log
  4. The Windows event log

Items 3 and 4 are rarely needed. If item 1 did not help, item 2 is usually all you need to look at to find the problem.
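
To make the job history check concrete, here is a minimal sketch in Python. It assumes the standard msdb tables sysjobs and sysjobhistory, where run_status 0 means failed and 1 means succeeded. The keyword list, the function name suspicious_rows, and the idea of scanning message text are my own illustration, not a built-in SQL Server feature. The `?` placeholder style depends on your database driver, and the rows are assumed to arrive as dictionaries keyed by column name.

```python
# T-SQL to pull the history of jobs whose name matches a pattern.
# Run it with whatever driver you use; the "?" parameter marker is an assumption.
JOB_HISTORY_SQL = """
SELECT j.name, h.step_id, h.step_name, h.run_status,
       h.run_date, h.run_time, h.message
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
WHERE j.name LIKE ?
ORDER BY h.instance_id DESC
"""

# Hypothetical keywords; extend them to match what you see in your own logs.
SUSPICIOUS = ("error", "fail", "could not", "cannot", "unable")


def suspicious_rows(rows):
    """Return history rows worth reading, even if the job reported success."""
    hits = []
    for row in rows:
        message = (row.get("message") or "").lower()
        failed = row.get("run_status") == 0  # 0 = failed in sysjobhistory
        if failed or any(word in message for word in SUSPICIOUS):
            hits.append(row)
    return hits
```

The point of scanning the message text instead of trusting run_status alone is that a job can report success while an individual step's message still describes the real problem.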

There is a lot more information about potential errors and how to identify them in my replication stairway articles, particularly in Level 10.