SQL Server Replication – Initialize with Snapshot Using Filtered Rows

sql-server-2008-r2transactional-replication

We have a large database that is being Transactionally replicated (one way) between sites.

The Publication has a date based filter so that it doesn't replicate old data.

I need to add a new subscription to this publication, but don't have the capacity in the transaction log at the new Subscription server to Snapshot all the data and it will take so long to copy the full snapshot over the network that the Subscriber will be millions of transactions behind after initialization.

My question is: does the filter apply to the snapshot?

For instance, the database and all data is on the new subscription server (via Backup/Restore). The filter on the publication (Filter rows) is set to > 2011 on a date field. If I set snapshot going to initialize the subscription is it going to:

  1. Create a snapshot of all data in the Publication articles that is greater than 2011

OR

  1. Create a snapshot of all data in the Publication

Further, if 1 is true, then when it applies the snapshot at the subscriber will it remove all data already in the table (so data added via the initial Backup/Restore operation will be gone. Leaving me with ONLY data > 2011?)

If you can't tell I need all the data in there from the Backup (the pre 2011) but need the snapshot to be small enough that it can initialize the sub within a sensible timeframe (therefore hopefully only the > 2011 data in the snapshot).

Thanks for any assistance.

Best Answer

A lot of testing later, I have the answers.

Does the filter apply to the snapshot?

Yes.

So out of my numbered list, 1 is correct. The reason I didn't really believe this to be true is because when quickly testing this I forgot about the option in Article Properties that controls how the Snapshot is applied. This also made me ask the question after the numbered list beginning 'Further'. The answer to that paragraph is: it depend how it is set in Article Properties. To explain this more fully:

Right click your Publication in SSMS > Properties > click Articles in the left tab > Article Properties button (pick either option that is relevant to you). Look at the Destination Object section > Action if name is in use

This option controls (for this or ALL articles) what the Agent does when applying the Snapshot to the Destination database. As Default it Drops and recreates the object (this is why it seemed to me to be replacing all the data - essentially it was because it was dropping the object, creating the object, BCP'ing the data in (so anything previously in the table is gone)).

I needed "Delete data. If article has a row filter, delete only data that matches the filter" as this therefore kept the data already in the table that was not > 2011.

Hopefully someone in the future can save some time testing by finding this answer! I'm slightly surprised that the default option is to Drop and replace the object to be honest, but I suppose that fits most people's purposes for Snapshot. So someone can find this answer if they are searching google I shall reproduce the question as I would google it:

When initializing replication from Snapshot on a publication with Filtered rows, does it apply the filter to the snapshot it takes?