Sql-server – How to compare the server impact of transaction replication

performancereplicationsql serversql-server-2008-r2

We are developing a system that must extract data from an existing SQL Server 2008 R2 cluster. One of our design goals is to limit the performance impact on that cluster as much as is practical.

Our system must have access to data that is < 1 minute old. So we are considering both a straightforward SELECT from the source clusterexecuted at regular intervals (say every 20 seconds), or alternatively, establishing transaction replication from the source cluster to our own system for subsequent processing.

We have a test instance of the cluster, and I would appreciate any recommendations on how to measure impact on the source cluster for each of these functions for the purposes of making an architectural decision.

I'm familiar with the use of the profiler, and that seems like a reasonable tool for analyzing the impact of the SELECTs, but I am running into difficulty when trying to find similar impact data for replication.

Best Answer

Our system must have access to data that is < 1 minute old.

alternatively, establishing transaction replication from the source cluster to our own system for subsequent processing.

Transactional replication will give you the ability of having a near-to real time data at the subscribers.

It comes at the cost, as during the snapshot, the entire table is locked and if the table is huge, then there will be prolonged blocking happening. As an alternative, you can initialize T-Rep from a backup.

Best is to use some stress utility to simulate a workload and then see the latency that is incurred on the T-Rep.

It also makes a big difference whether you are replicating your entire database vs a subset of tables, whether the distribution is on the same server as publisher server, the n/w latency between publisher and subscriber and most obviously the hardware.

It will be tricky when you have to add new table to the publisher or do some schema modification on the publisher - as it will require a snapshot to be generated. Also, be careful with Identity columns when implementing T-Rep.

Below are some links/references that will get you started.

Related Solutions

Sql-server – Can two servers access the same database files

Two distinct SQL Server machines can't share the same database files. Except when we're speaking about SQL Server 2008 Failover Clustering.

References here:

MSDN: Getting Started with SQL Server 2008 R2 Failover Clustering
MSDN: How to: Create a New SQL Server Failover Cluster (Setup)
our own MrDenny: SQL Server 2008 Clustering
Brad McGehee: An Introduction to SQL Server Clustering
Advantages And Disadvantages of Clustering SQL Server
How to Cluster SQL Server 2005

PS: while I use Windows 7 on the work and home machines I wouldn't advice having it as a Operating System for a database server. I think that MS Windows Server 2008 R2 should be the choice in this case.

SQL Server – Many-to-One Replication

I have 8 individual SQL Server 2008 R2 machines, each hosting 1 database. Each database has an identical table structure and schema, and entirely unique data. I would like to establish a reporting server (may be 2008 or 2012), that consolidates the rows from selected tables across the 8 source servers into a single instance of those tables on the reporting server. This is one-way replication (no changes will be made to the reporting server). I will need to replicate changes from the source databases with relatively low latency (say 20-30 seconds).

You can achieve this with Transactional replication. Below is how you can do it.

Note : You have to change slightly your table schema to achieve this as you have to uniquely identify that rows when you are replicating to the subscriber. As a prerequisite of T-Rep you need to have tables with PK defined.

Below is your sample table on Publisher servers that is on all your 8 servers that you want to consolidate rows on the reporting server :

CREATE TABLE Products
(
ProductID INT not null,
ProductName VARCHAR(25),
ServerName sysname default @@servername not null -- this is to identify which row is from which server ; probably add this using Alter column
)
GO
ALTER TABLE Products
ADD CONSTRAINT pk_Product_ID_ServerName PRIMARY KEY (ProductID)

On the subscriber server, you need to create the same table but with different PK to uniquely identify the rows at subscriber (not doing so, T-Rep is going to fail with PK violation - I am assuming that you cannot modify the PK structure on live PRODUCTION rather its better to modify at the subscriber)

CREATE TABLE Products
(
ProductID INT not null,
ProductName VARCHAR(25),
ServerName sysname default @@servername not null
);
GO

ALTER TABLE Products
ADD CONSTRAINT pk_Product_ID_ServerName PRIMARY KEY (ProductID,ServerName)

Below script will help you setting up T-Rep, just change the databasename, destination server name along with object name.

-- Enabling the replication database
use master
exec sp_replicationdboption @dbname = N'repl1', @optname = N'publish', @value = N'true'
GO

exec [repl1].sys.sp_addlogreader_agent @job_login = null, @job_password = null, @publisher_security_mode = 1
GO
exec [repl1].sys.sp_addqreader_agent @job_login = null, @job_password = null, @frompublisher = 1
GO
-- Adding the transactional publication
use [repl1]
exec sp_addpublication @publication = N'repl1_2005', @description = N'Transactional publication of database ''repl1'' from Publisher ''server_name\SQL2005''.', @sync_method = N'concurrent', @retention = 0, @allow_push = N'true', @allow_pull = N'true', @allow_anonymous = N'false', @enabled_for_internet = N'false', @snapshot_in_defaultfolder = N'true', @compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous', @allow_subscription_copy = N'false', @add_to_active_directory = N'false', @repl_freq = N'continuous', @status = N'active', @independent_agent = N'true', @immediate_sync = N'false', @allow_sync_tran = N'false', @autogen_sync_procs = N'false', @allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 1, @allow_initialize_from_backup = N'false', @enabled_for_p2p = N'false', @enabled_for_het_sub = N'false'
GO


exec sp_addpublication_snapshot @publication = N'repl1_2005', @frequency_type = 1, @frequency_interval = 0, @frequency_relative_interval = 0, @frequency_recurrence_factor = 0, @frequency_subday = 0, @frequency_subday_interval = 0, @active_start_time_of_day = 0, @active_end_time_of_day = 235959, @active_start_date = 0, @active_end_date = 0, @job_login = null, @job_password = null, @publisher_security_mode = 1
exec sp_grant_publication_access @publication = N'repl1_2005', @login = N'sa'
GO
exec sp_grant_publication_access @publication = N'repl1_2005', @login = N'NT AUTHORITY\SYSTEM'
GO
exec sp_grant_publication_access @publication = N'repl1_2005', @login = N'BUILTIN\Administrators'
GO
exec sp_grant_publication_access @publication = N'repl1_2005', @login = N'server_name\SQLServer2005SQLAgentUser$server_name$SQL2005'
GO
exec sp_grant_publication_access @publication = N'repl1_2005', @login = N'server_name\SQLServer2005MSSQLUser$server_name$SQL2005'
GO
exec sp_grant_publication_access @publication = N'repl1_2005', @login = N'distributor_admin'
GO

-- Adding the transactional articles
use [repl1]
exec sp_addarticle @publication = N'repl1_2005', @article = N'Products', @source_owner = N'dbo', @source_object = N'Products', @type = N'logbased', @description = N'', @creation_script = N'', @pre_creation_cmd = N'none', @schema_option = 0x000000000803509F, @identityrangemanagementoption = N'none', @destination_table = N'Products', @destination_owner = N'dbo', @status = 24, @vertical_partition = N'false', @ins_cmd = N'CALL [sp_MSins_dboProducts]', @del_cmd = N'CALL [sp_MSdel_dboProducts]', @upd_cmd = N'SCALL [sp_MSupd_dboProducts]'
GO

-- Adding the transactional subscriptions
use [repl1]
exec sp_addsubscription @publication = N'repl1_2005', @subscriber = N'server_name\SQL2008R2', @destination_db = N'repl123', @subscription_type = N'Push', @sync_type = N'automatic', @article = N'all', @update_mode = N'read only', @subscriber_type = 0
exec sp_addpushsubscription_agent @publication = N'repl1_2005', @subscriber = N'server_name\SQL2008R2', @subscriber_db = N'repl123', @job_login = null, @job_password = null, @subscriber_security_mode = 1, @frequency_type = 64, @frequency_interval = 1, @frequency_relative_interval = 1, @frequency_recurrence_factor = 0, @frequency_subday = 4, @frequency_subday_interval = 5, @active_start_time_of_day = 0, @active_end_time_of_day = 235959, @active_start_date = 0, @active_end_date = 0, @dts_package_location = N'Distributor'
GO

Couple of points to note :

In sp_addsubscription make sure that @sync_type = N'automatic'

And the article properties should be set to :

enter image description here

So finally, you can have rows consolidated from all (in my case 3 servers) as below :

enter image description here

So in summary,

Use T-Rep.
Add an additional column to the existing Publisher databases e.g. serverName to uniquely identify the rows at the subscriber.
Create table on Subscriber having PK included as ServerName.
Create replication of the tables with @sync_type = N'automatic' and Article property set to "Keep existing object unchanged".
Run snapshot agent.
Check the consolidated data on the subscriber.

Best Answer

Related Solutions

Sql-server – Can two servers access the same database files

SQL Server – Many-to-One Replication

Related Question