Sql-server – Why Regenerate Snapshots For Merge Replication

merge-replication, replication, sql-server

I am in the process of implementing merge replication in SQL Server 2012 with web sync.

I am wondering about two things:

  1. Why is it suggested to regenerate the snapshots every 14 days by default?
  2. With web sync you are an anonymous subscriber, so how can replication know when to clean up metadata for a user who never syncs?

Taking the first question: if a person syncs every day and pulls back about the same amount of changes each time, why specifically would we have to regenerate the snapshot? I'm not sure why a sync like this would become slower.

Is it a matter of knowing when metadata can be cleaned up? That is, once the user with the oldest snapshot has regenerated a new one, does that mean we can clean up the metadata up to the next oldest snapshot?

This leads to the second question. Suppose I have given a user a merge replication solution as a demo. It turns out that they never use the system and have only synced once, at the start, to test it out. They may even have removed it from their computer.

If their snapshot job has been turned off and they never sync, does that mean we get stuck with a whole bunch of metadata that replication cannot clean up? Does replication at some point conclude that the person is not using the system and block them?

The reason I ask is that I am using anonymous subscribers. With normal subscribers, when we delete the subscription the subscriber is connected to the server directly, so the subscription is also removed from the publication. This does not happen with web sync.

Best Answer

I think the assumption behind the default setting is that you'll be pushing out new subscriptions on a regular basis. You don't need to regenerate on a schedule; however, if you don't, you'll need to run the Snapshot Agent manually whenever you push out a new subscription (which really isn't that big a deal).
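
Running the Snapshot Agent manually can be done from T-SQL. The following is a minimal sketch, not a definitive procedure: it assumes a merge publication named `MyMergePublication` in a publication database named `MyPublicationDb` (both names are hypothetical placeholders), and it uses the `sp_startpublication_snapshot` system procedure, which is executed at the Publisher on the publication database.

```sql
-- Assumption: MyPublicationDb and MyMergePublication are placeholder names;
-- substitute your own publication database and publication.
USE MyPublicationDb;
GO

-- Starts the Snapshot Agent job for the publication, which generates
-- a fresh snapshot that a new subscription can be initialized from.
EXEC sp_startpublication_snapshot
    @publication = N'MyMergePublication';
GO
```

The call only starts the agent job; the snapshot itself is generated asynchronously, so it may take a while before new subscribers can initialize from it.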

When the subscription is created, the distributor stores information about the subscriber so that it has this information available for cleanup and data distribution.