SQL Server – How to back up a large SQL Server table

backup, sql-server, sql-server-2008-r2

Environment: SQL Server 2008 R2 Standard Edition on Windows Server 2008 R2 (yeah – I know). VMware, Commvault.

We have a 1.4 TB DB with one table. Here is the schema:

CREATE TABLE [dbo].[FileManager_FilesContent](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [GUID] [uniqueidentifier] NOT NULL,
    [FileContent] [varbinary](max) NOT NULL,
    [Extension] [varchar](20) NULL,
    [IsDeleted] [bit] NULL,
 CONSTRAINT [PK_FileManager_FilesContent] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_FileManager_FilesContent] ON [dbo].[FileManager_FilesContent]
(
    [GUID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

We have a network share where we back up the other DBs. This one won't back up there because it would take more than 40 hours. My thought is that backing up to a local drive (or drives), plus backup compression, striping, and dialing in MAXTRANSFERSIZE and BUFFERCOUNT, might help.
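For reference, the kind of striped, compressed local backup I have in mind looks roughly like this (the database name, paths, stripe count, and tuning values are all placeholders to experiment with; backup compression is available in 2008 R2 Standard):

BACKUP DATABASE [FileStore]
TO  DISK = N'E:\Backup\FileStore_1.bak',
    DISK = N'F:\Backup\FileStore_2.bak',
    DISK = N'G:\Backup\FileStore_3.bak',
    DISK = N'H:\Backup\FileStore_4.bak'
WITH COMPRESSION,
     MAXTRANSFERSIZE = 4194304,   -- 4 MB per I/O request (the maximum)
     BUFFERCOUNT = 64,            -- total buffer memory = BUFFERCOUNT * MAXTRANSFERSIZE
     STATS = 5;                   -- progress messages every 5%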

Any other advice on how to get this DB backed up?

Thanks in advance,
-Peter

Best Answer

You could partition the table, putting each partition in a separate filegroup via the partition scheme. Each filegroup can then be backed up individually. (From SQL Server 2019 there is also a DMV that exposes which pages, and hence which files, are dirty, though that won't help on 2008 R2.)
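A rough sketch of the plumbing, assuming you range-partition on the existing Id column (the database name, filegroup names, file paths, and boundary values below are all placeholders):

ALTER DATABASE [FileStore] ADD FILEGROUP [FG1];
ALTER DATABASE [FileStore] ADD FILEGROUP [FG2];
ALTER DATABASE [FileStore] ADD FILE
    (NAME = N'FileStore_FG1', FILENAME = N'E:\Data\FileStore_FG1.ndf') TO FILEGROUP [FG1];
ALTER DATABASE [FileStore] ADD FILE
    (NAME = N'FileStore_FG2', FILENAME = N'F:\Data\FileStore_FG2.ndf') TO FILEGROUP [FG2];
GO

CREATE PARTITION FUNCTION [pf_FilesContent] (bigint)
AS RANGE LEFT FOR VALUES (1000000);          -- boundary values are placeholders

CREATE PARTITION SCHEME [ps_FilesContent]
AS PARTITION [pf_FilesContent] TO ([FG1], [FG2]);
GO

-- Rebuild the clustered index onto the partition scheme to move the data.
-- On 1.4 TB this is a large, fully logged one-off operation: plan the window,
-- log space and tempdb. If DROP_EXISTING is not acceptable for the constraint
-- index, the alternative is to drop and re-create the PK constraint on the scheme.
CREATE UNIQUE CLUSTERED INDEX [PK_FileManager_FilesContent]
ON [dbo].[FileManager_FilesContent] ([Id])
WITH (DROP_EXISTING = ON)
ON [ps_FilesContent] ([Id]);

The unique nonclustered index on GUID can stay where it is (non-aligned on [PRIMARY]) or be rebuilt aligned to the scheme; that's a separate decision.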

There will still be 1.4 TB to back up, so a complete backup cycle will still take at least 40 hours. However, each individual filegroup will take much less time, avoiding the downtime, network congestion and other worries of one monolithic backup.
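Each filegroup backup would then look something like this (names are placeholders again; note that individually backing up read-write filegroups requires the FULL or BULK_LOGGED recovery model so that log backups can make them restorable):

BACKUP DATABASE [FileStore]
    FILEGROUP = N'FG1'
TO DISK = N'E:\Backup\FileStore_FG1.bak'
WITH COMPRESSION, STATS = 5;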

If writes typically touch only a few partitions, you could establish some form of change tracking - either built-in or application code - and only back up the dirty filegroups, as sketched below.
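One possible application-side sketch, assuming the partition function above (the tracking table and trigger names are hypothetical): a trigger that records which partitions have been touched, so the backup job can query it and pick only the matching filegroups.

CREATE TABLE dbo.DirtyPartitions
(
    PartitionNumber int      NOT NULL PRIMARY KEY,
    LastTouched     datetime NOT NULL
);
GO

CREATE TRIGGER trg_FilesContent_TrackDirty
ON dbo.FileManager_FilesContent
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Record the partition number of every row touched by this statement.
    MERGE dbo.DirtyPartitions AS tgt
    USING (
        SELECT $PARTITION.pf_FilesContent(Id) AS PartitionNumber FROM inserted
        UNION
        SELECT $PARTITION.pf_FilesContent(Id) FROM deleted
    ) AS src
    ON tgt.PartitionNumber = src.PartitionNumber
    WHEN MATCHED THEN
        UPDATE SET LastTouched = GETDATE()
    WHEN NOT MATCHED THEN
        INSERT (PartitionNumber, LastTouched) VALUES (src.PartitionNumber, GETDATE());
END;

The backup job would clear a partition's row once its filegroup has been backed up.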

Depending on when rows are written you may be able to make some filegroups read-only. Once a final backup of each read-only filegroup is taken, those filegroups can be omitted from the ongoing backup cycles, reducing elapsed time.
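Roughly, using the placeholder names above:

-- Needs exclusive access to the database while the filegroup is switched.
ALTER DATABASE [FileStore] MODIFY FILEGROUP [FG1] READONLY;
GO

-- One final backup of the now read-only filegroup...
BACKUP DATABASE [FileStore] FILEGROUP = N'FG1'
TO DISK = N'E:\Backup\FileStore_FG1_final.bak' WITH COMPRESSION;

-- ...after which the routine cycle only touches the read-write filegroups.
BACKUP DATABASE [FileStore] READ_WRITE_FILEGROUPS
TO DISK = N'E:\Backup\FileStore_rw.bak' WITH COMPRESSION, STATS = 5;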

Potentially backups and transfers to the network share could happen in parallel, reducing the amount of local workspace required (back up partition 1, transfer it over the network while partition 2's backup is prepared, then transfer 2 while 3 is backed up, and so on; each local backup can be deleted as soon as it is on the network drive).

If you can arrange things appropriately you may be able to implement piecemeal restores, which will certainly improve your RTO compared to a single 40-hour restore.
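A piecemeal restore sequence, again with placeholder backup names and assuming the FULL recovery model, might look like this:

-- 1. Bring the PRIMARY filegroup back first and recover the database:
RESTORE DATABASE [FileStore]
    FILEGROUP = N'PRIMARY'
FROM DISK = N'E:\Backup\FileStore_primary.bak'
WITH PARTIAL, NORECOVERY;
RESTORE LOG [FileStore] FROM DISK = N'E:\Backup\FileStore_tail.trn' WITH RECOVERY;
-- The restored filegroups are now usable; the rest remain offline.

-- 2. Restore the remaining filegroups later, each rolled forward with the
--    log backups needed to bring it to the same point:
RESTORE DATABASE [FileStore]
    FILEGROUP = N'FG1'
FROM DISK = N'E:\Backup\FileStore_FG1.bak'
WITH NORECOVERY;
RESTORE LOG [FileStore] FROM DISK = N'E:\Backup\FileStore_tail.trn' WITH RECOVERY;

Note that online piecemeal restore (the database stays usable while the later filegroups are restored) is an Enterprise Edition feature; on Standard the later restore sequences are offline restores.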

Of course the scripting and restores will be that bit more complicated.