SQL Server – the best backup strategy for a large DB with few tables and lots of immutable data

backup, performance, sql-server, sql-server-2017

We have a large DB (500 GB and expanding). 95%+ of the data is stored in 3 tables (one table has 2bn+ rows). The data is largely immutable – i.e. once it's added, it's only ever read afterwards. We can't archive old data.

We're using SQL Server 2017, which allows for compressed backups, but even then it takes a long time to back up and copy the files across the wire to the backup server.

We'd like to speed up this process (and, for DR purposes, have the backups in the cloud – probably Azure). Differential backups are small enough, but I think we'd still need to take regular full backups (e.g. a full each weekend plus a differential each night is still a problem for us).
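To be concrete, the pattern I'm describing would look something like this (the database name, UNC paths and storage URL are placeholders, not our real ones):

```sql
-- Sketch only: database name, paths and URL are placeholders.

-- Weekly compressed full backup
BACKUP DATABASE BigDb
TO DISK = N'\\backupserver\sql\BigDb_full.bak'
WITH COMPRESSION, CHECKSUM, STATS = 10;

-- Nightly differential (only the extents changed since the last full)
BACKUP DATABASE BigDb
TO DISK = N'\\backupserver\sql\BigDb_diff.bak'
WITH DIFFERENTIAL, COMPRESSION, CHECKSUM, STATS = 10;

-- SQL Server 2017 can also back up straight to Azure blob storage for the DR copy
-- (requires a credential for the storage container, e.g. a SAS token)
BACKUP DATABASE BigDb
TO URL = N'https://mystorageaccount.blob.core.windows.net/backups/BigDb_diff.bak'
WITH DIFFERENTIAL, COMPRESSION, CHECKSUM;
```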

My feeling is to use partitioning to split the large tables on some sensible column (the clustered index is on an identity ID); then we could mark the older partitions as read-only, back them up just once, and never need to back them up again.
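Roughly, what I have in mind is to place closed-off partitions on their own filegroups, mark those read-only, back each one up once, and then let the routine backups cover only the read-write filegroups. A minimal sketch, assuming the filegroup names below exist (they're made up):

```sql
-- Sketch only: filegroup names are made up.

-- Once a filegroup holds only "closed" partitions, mark it read-only
-- (this needs exclusive access to the database, so it has to be scheduled)
ALTER DATABASE BigDb MODIFY FILEGROUP FG_2017 READ_ONLY;

-- Back that filegroup up once and archive the file
BACKUP DATABASE BigDb
FILEGROUP = N'FG_2017'
TO DISK = N'\\backupserver\sql\BigDb_FG_2017.bak'
WITH COMPRESSION, CHECKSUM;

-- From then on, routine backups only need the read-write filegroups
BACKUP DATABASE BigDb
READ_WRITE_FILEGROUPS
TO DISK = N'\\backupserver\sql\BigDb_rw.bak'
WITH COMPRESSION, CHECKSUM;
```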

This is not ideal, as we're going to need to keep adding more partitions in the future. Also, the DB cannot be taken down for more than a few minutes, so I think I'm going to have to build a shadow, partitioned version of the data and then do some switch-over to reduce downtime, which is a bit risky and complicated.
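For the switch-over itself, the rough shape I'm imagining (all object names, boundary values and filegroups below are illustrative, not our real schema) is to build a partitioned shadow table, backfill it, and swap names during a short maintenance window:

```sql
-- Sketch only: object names, boundaries and filegroups are illustrative.

-- Partition function/scheme keyed on the identity column
CREATE PARTITION FUNCTION pf_BigTable (bigint)
AS RANGE RIGHT FOR VALUES (500000000, 1000000000, 1500000000);

CREATE PARTITION SCHEME ps_BigTable
AS PARTITION pf_BigTable TO (FG_2015, FG_2016, FG_2017, FG_CURRENT);

-- Shadow copy of the big table, built on the partition scheme and backfilled in batches
CREATE TABLE dbo.BigTable_New
(
    Id      bigint       NOT NULL,
    Payload varchar(100) NOT NULL,   -- stand-in for the real columns
    CONSTRAINT PK_BigTable_New PRIMARY KEY CLUSTERED (Id)
) ON ps_BigTable (Id);

-- During a short window: catch up the last few rows, then swap the names over
EXEC sp_rename 'dbo.BigTable', 'BigTable_Old';
EXEC sp_rename 'dbo.BigTable_New', 'BigTable';
```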

If anyone has backup strategies for this sort of DB configuration that they think would work better (or can confirm that this seems like a good idea given the constraints that I have), I'd be glad to hear them 🙂

Additional information:

Current backup schedule:

  1. Full backup (nightly) – compressed backups are about 100 GB (500 GB uncompressed) and take about 40 minutes (with compression)

  2. Log backups (Every 10 minutes) – Almost instant and only about 20 MB each.

Now I know that, for some, 40 minutes is not a very long time and 100 GB is not a very large file, but I also know that, given that 95%+ of the data is immutable and could safely be backed up only once, the backups could take less than a couple of minutes and be a couple of gigabytes (and that's being conservative).

I believe that partitioning is one of the tools used to help manage backups, specifically for this type of scenario, and I am hoping to get someone with real-world experience with this (or with SQL Server-based alternatives for my scenario) to shed some light on what works for them.

Best Answer

I think you should look into log shipping. Once you get it set up, your 10-minute log backups can be shipped to the other server, so only the changes move across the wire.

Assuming the log shipping destination is also where you want your backups kept, you already have what you need there: other than the initial full backup, you can run for years and only ship the logs.
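Under the covers the mechanism is just backup/copy/restore of the log, so it's easy to see what actually moves each cycle. A sketch, with placeholder names and paths:

```sql
-- Sketch only: database name and paths are placeholders.

-- On the primary: the 10-minute log backup you already take
BACKUP LOG BigDb
TO DISK = N'\\backupserver\logship\BigDb_202401010010.trn'
WITH COMPRESSION, CHECKSUM;

-- On the secondary: restore each shipped log, leaving the database ready for the next one
RESTORE LOG BigDb
FROM DISK = N'D:\logship\BigDb_202401010010.trn'
WITH NORECOVERY;   -- or WITH STANDBY = '<undo file>' if you want it readable between restores
```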

You could take full backups at both your primary and the log shipping destination. That way you would have backups in both locations, and the only data moving daily is the logs being shipped.

Alternatively, you can use AlwaysOn availability groups instead of log shipping for the same effect.
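One caveat if you go the availability group route: full backups taken on a secondary replica have to be copy-only (regular log backups are fine there). Something like this, with placeholder names:

```sql
-- Sketch only: run on a readable AG secondary; database name and paths are placeholders.

-- Full backups on a secondary replica must be COPY_ONLY
BACKUP DATABASE BigDb
TO DISK = N'E:\backups\BigDb_full_copyonly.bak'
WITH COPY_ONLY, COMPRESSION, CHECKSUM;

-- Regular log backups can run on the secondary and keep the log chain intact
BACKUP LOG BigDb
TO DISK = N'E:\backups\BigDb_log.trn'
WITH COMPRESSION, CHECKSUM;
```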

Licensing: I am not an expert on licensing, but my understanding is that if the log shipping (or AlwaysOn) destination instance is used for DR only, there is no additional cost.