Sql-server – How to compare the server impact of transaction replication

performancereplicationsql serversql-server-2008-r2

We are developing a system that must extract data from an existing SQL Server 2008 R2 cluster. One of our design goals is to limit the performance impact on that cluster as much as is practical.

Our system must have access to data that is < 1 minute old. So we are considering both a straightforward SELECT from the source clusterexecuted at regular intervals (say every 20 seconds), or alternatively, establishing transaction replication from the source cluster to our own system for subsequent processing.

We have a test instance of the cluster, and I would appreciate any recommendations on how to measure impact on the source cluster for each of these functions for the purposes of making an architectural decision.

I'm familiar with the use of the profiler, and that seems like a reasonable tool for analyzing the impact of the SELECTs, but I am running into difficulty when trying to find similar impact data for replication.

Best Answer

Our system must have access to data that is < 1 minute old.

alternatively, establishing transaction replication from the source cluster to our own system for subsequent processing.

Transactional replication will give you the ability of having a near-to real time data at the subscribers.

It comes at the cost, as during the snapshot, the entire table is locked and if the table is huge, then there will be prolonged blocking happening. As an alternative, you can initialize T-Rep from a backup.

Best is to use some stress utility to simulate a workload and then see the latency that is incurred on the T-Rep.

It also makes a big difference whether you are replicating your entire database vs a subset of tables, whether the distribution is on the same server as publisher server, the n/w latency between publisher and subscriber and most obviously the hardware.

It will be tricky when you have to add new table to the publisher or do some schema modification on the publisher - as it will require a snapshot to be generated. Also, be careful with Identity columns when implementing T-Rep.

Below are some links/references that will get you started.