SQL Server – How to automatically stagger transaction log shipping to minimize bandwidth usage peaks

backup, sql-server, sql-server-2008

The following question pertains to Microsoft SQL Server transactional log shipping (TLS).

We're using SQL Server 2008 R2 SP1, although the question is probably relevant for all recent editions.

Background

I have a primary data center (A) and a secondary, disaster recovery data center (B). Let's say I have 64 databases of varying sizes that need to be log shipped, and no single database exceeds 50 GB.

I'm currently using the default SQL Agent Jobs that SQL Server automatically creates, so I have 64 SQL Agent Jobs.

I want to log ship every database every 15 minutes. Assume that is achievable if I simply transfer all the files back-to-back.

Issue

Let's also say that the path from A to B travels over the public internet, and so I am paying for the bandwidth to ship these logs. I'm metered at the 95th percentile. I want to smooth out or optimize bandwidth so as to minimize the chance of paying overages. I'm using backup compression.

Possible Resolution

My current idea is to write a script that automatically adjusts the start time of each backup job. The script could be run as frequently as desired to keep the transfers well spaced, and running it regularly would let the backup jobs for newly added databases be staggered without human intervention.

The SQL Agent jobs that log shipping creates, and their schedules, all start with the string "LSBackup", so I can list the schedules using:

SELECT * FROM msdb..sysschedules WHERE name LIKE 'LSBackup%'

I can fetch the list of those schedules into a table variable, then iterate over it in a WHILE loop, calling EXEC msdb.dbo.sp_update_schedule to set each @active_start_time so the jobs are spaced appropriately.
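As a rough sketch of the mechanics (assuming, for the moment, a simple even spacing across the 15-minute window rather than any size-based weighting), that loop might look something like this:

-- Stagger every LSBackup schedule's start time across the 15-minute window.
DECLARE @schedules TABLE (rn INT IDENTITY(1,1), schedule_id INT);

INSERT INTO @schedules (schedule_id)
SELECT schedule_id
FROM msdb.dbo.sysschedules
WHERE name LIKE 'LSBackup%'
ORDER BY name;

DECLARE @i INT = 1, @max INT, @schedule_id INT, @offset_seconds INT, @start_time INT;
SELECT @max = MAX(rn) FROM @schedules;

WHILE @i <= @max
BEGIN
    SELECT @schedule_id = schedule_id FROM @schedules WHERE rn = @i;

    -- Spread 64 jobs evenly across 900 seconds: about 14 seconds apart.
    SET @offset_seconds = (@i - 1) * 14;

    -- @active_start_time is an integer encoded as HHMMSS.
    SET @start_time = (@offset_seconds / 3600) * 10000
                    + ((@offset_seconds % 3600) / 60) * 100
                    + (@offset_seconds % 60);

    EXEC msdb.dbo.sp_update_schedule
         @schedule_id = @schedule_id,
         @active_start_time = @start_time;

    SET @i += 1;
END;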

What value should I use for the start time for each job?

To answer this, I need to know how big each log backup file is likely to be so that I can space the jobs optimally. Can you predict how large a transaction log backup will be before you run it? Alternatively, I could look at recent days' log backups on the local file system to determine the relative weights for each database. However, that approach won't work for a brand new database with little to no transaction log backup history.
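As a hedged alternative to scanning the file system, the backup history in msdb exposes the same information; something along these lines could provide per-database weights (a brand new database will simply have no rows yet):

-- Relative log backup volume per database over the last 24 hours,
-- using the compressed size since backup compression is enabled.
SELECT database_name,
       COUNT(*) AS log_backups,
       SUM(compressed_backup_size) / 1048576.0 AS total_compressed_mb,
       AVG(compressed_backup_size) / 1048576.0 AS avg_compressed_mb
FROM msdb.dbo.backupset
WHERE type = 'L'                                        -- transaction log backups only
  AND backup_finish_date >= DATEADD(DAY, -1, GETDATE())
GROUP BY database_name
ORDER BY total_compressed_mb DESC;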

Assuming it is possible to determine the relative size of each backup, what is an optimal algorithm for spacing them out so as to minimize bandwidth impact when metered at the 95th percentile?

Best Answer

DBCC SQLPERF (LOGSPACE) returns the size of each database's log file in MB and the percentage currently used. From those two values you can calculate the currently utilized portion of the log file in MB.
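For example, the output can be captured into a table so that the used portion can be computed in MB (the column names in the table variable are just local names for the four columns DBCC SQLPERF returns):

-- Capture DBCC SQLPERF (LOGSPACE) output and compute the used portion in MB.
DECLARE @logspace TABLE (
    database_name  SYSNAME,
    log_size_mb    FLOAT,
    log_space_used FLOAT,   -- percentage of the log file currently in use
    status         INT
);

INSERT INTO @logspace
EXEC ('DBCC SQLPERF (LOGSPACE)');

SELECT database_name,
       log_size_mb,
       log_space_used,
       log_size_mb * log_space_used / 100.0 AS log_used_mb
FROM @logspace;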

That by itself will not be sufficient to calculate the log backup size, not least because you are using backup compression. However, collecting this data over time and correlating it with the size of the resulting backups might reveal a linear relationship.

You could insert a step into the log shipping backup job to record the utilized portion of the log file before the backup, and another step after the backup to record the new size of the log file and the size of the backup.
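As a sketch of what those extra steps could record (the table and database names here are hypothetical placeholders), the pre-backup step would write the used size, and the post-backup step could pick up the compressed backup size from msdb:

-- Hypothetical table to hold one row per log backup, populated by the extra job steps.
CREATE TABLE dbo.LogBackupSizeHistory (
    database_name      SYSNAME,
    sample_time        DATETIME DEFAULT GETDATE(),
    log_used_mb_before FLOAT,          -- recorded by the step before the backup
    log_used_mb_after  FLOAT NULL,     -- recorded by the step after the backup
    backup_size_mb     FLOAT NULL      -- compressed size of the backup just taken
);

-- Post-backup step: compressed size of the most recent log backup for this database.
SELECT TOP (1) compressed_backup_size / 1048576.0 AS backup_size_mb
FROM msdb.dbo.backupset
WHERE database_name = 'YourDatabase'   -- placeholder database name
  AND type = 'L'
ORDER BY backup_finish_date DESC;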

Running this for a period and analyzing the results could allow you to estimate the size of the backup file.
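Once some history has accumulated, a quick first check (using the hypothetical history table above) is simply how stable the ratio of backup size to used log space is for each database:

-- Per database: how consistently does backup size track the used log space?
SELECT database_name,
       COUNT(*) AS samples,
       AVG(backup_size_mb / NULLIF(log_used_mb_before, 0))   AS avg_ratio,
       STDEV(backup_size_mb / NULLIF(log_used_mb_before, 0)) AS ratio_stdev
FROM dbo.LogBackupSizeHistory
WHERE backup_size_mb IS NOT NULL
GROUP BY database_name
ORDER BY database_name;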

It's not the most straightforward method and, to be honest, it might not even work. If you are interested in giving it a go, I would be interested in helping.