FreeBSD – ZFS Snapshot to File as Backup with Rotation

Tags: backup, freebsd, freenas, snapshot, zfs

I have a local FreeNAS system and want to use ZFS snapshots for backups.
FreeNAS has built-in Replication Tasks, which use

zfs send snapshot_name

to send a snapshot to a remote system.
But this needs a system with ZFS on the other end.

I want to write the snapshot to a file, then send this compressed and encrypted file to the remote machine.

This is possible with

zfs send snapshot_name | gzip | openssl enc -aes-256-cbc -a -salt > file.gz.ssl

Every day I take a snapshot of the storage pool and keep each snapshot for 30 days.
Each time a snapshot is taken, I pipe it to a file.
– snapshot_file 1 has every file in it (let's say 2GB)
– snapshot_file 2 only has the changes to snapshot_file 1 (let's say 5MB)
– snapshot_file 3 holds the changes to snapshot_file 2; and so on.

On day 31, snapshot_file 1 is deleted (because I only want the changes from the last 30 days).

Therefore snapshot_file 2 now needs to hold every file (the 2GB of snapshot_file 1 plus the 5MB of changes).

But with this approach, every day (from day 31 on) a new 2GB file has to be created and sent to the remote system. This is too much overhead.

What would be the best approach to use snapshots piped to a file as a backup strategy with a history of X days?

P.S.: I know there is a lot of backup software out there (rdiff-backup, for example) that I could use. But I am curious how this could be done.

Best Answer

If you store the snapshots in files, as opposed to in the file system (e.g. with zfs receive), I'm afraid this is not possible.

ZFS on the receiving side

If you use ZFS on the sending and on the receiving side you can avoid having to transfer the whole snapshot and only transfer the differences of the snapshot compared to the previous one:

ssh myserver 'zfs send -i pool/dataset@2014-02-04 pool/dataset@2014-02-05' | \
  zfs receive backup/dataset   # backup/dataset is a placeholder for the local target

ZFS knows about the snapshots and stores blocks shared between them only once. Because the file system understands the snapshots, you can delete the old ones without problems.
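With ZFS on both ends, rotation comes down to destroying the snapshots that have aged out. Here is a minimal sketch, assuming snapshots are named dataset@YYYY-MM-DD as in the example above. The function name snapshots_older_than and the cutoff date are made up for illustration:

```shell
#!/bin/sh
# Read snapshot names from stdin (one per line) and print those whose date
# suffix sorts before the cutoff given as $1.
# Assumption: names look like dataset@YYYY-MM-DD, so a plain string
# comparison orders them chronologically.
snapshots_older_than() {
    awk -F@ -v cutoff="$1" '($2 "") < (cutoff "") { print }'
}

# Possible use on the receiving box (not executed here):
#   zfs list -H -t snapshot -o name -r backup/dataset \
#     | snapshots_older_than 2014-01-06 \
#     | xargs -n 1 zfs destroy
```

It is worth checking the printed list by hand before piping it into zfs destroy.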

Other file system on the receiving side

In your case you store the snapshots in individual files, and your file system is unaware of the snapshots. As you already noticed, this breaks rotation. You can either transmit entire snapshots, which wastes bandwidth and storage space but lets you delete individual snapshots because they don't depend on each other, or you can do incremental snapshots like this:

ssh myserver 'zfs send -i pool/dataset@2014-02-04 pool/dataset@2014-02-05' \
  > incremental-2014-02-04:05

To restore an incremental snapshot you need all the previous snapshots it builds on as well, back to the last full one. This means you can't delete the old incrementals.
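The dependency becomes visible when restoring: the streams have to be replayed in order, starting with a full one. This is a rough sketch only. The function name restore_chain and the target dataset are hypothetical, and the files are assumed to be raw zfs send streams like the one written above:

```shell
#!/bin/sh
# Replay a full stream followed by its incrementals, oldest first.
# $1 = target dataset, remaining arguments = stream files in chronological order.
# Stops at the first failure, since every later stream depends on the earlier ones.
restore_chain() {
    target=$1
    shift
    for f in "$@"; do
        # zfs receive may need -F if the target was modified between receives.
        zfs receive "$target" < "$f" || return 1
    done
}

# Possible use (not executed here):
#   restore_chain pool/restored full-2014-02-01 incremental-2014-02-01:02
```

If any file in the middle of the chain is missing, everything after it becomes unusable.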

Possible solutions

You could do incrementals as shown in my last example and do a new non-incremental every month. The new incrementals depend on this new non-incremental, so you're free to delete the old snapshot files.
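The monthly-full scheme could look roughly like this. It is only a sketch under a few assumptions: it runs on the machine that holds the pool, snapshots are named dataset@YYYY-MM-DD, and the function name send_daily and the output file names are made up for illustration:

```shell
#!/bin/sh
# Write today's stream into a directory: a full stream on the first day of
# the month, otherwise an incremental against yesterday's snapshot.
# $1 = dataset, $2 = yesterday (YYYY-MM-DD), $3 = today (YYYY-MM-DD), $4 = output dir
send_daily() {
    ds=$1 prev=$2 cur=$3 dir=$4
    case $cur in
        *-01) zfs send "$ds@$cur" > "$dir/full-$cur" ;;
        *)    zfs send -i "$ds@$prev" "$ds@$cur" > "$dir/incremental-$cur" ;;
    esac
}

# Possible use (not executed here):
#   send_daily pool/dataset 2014-02-04 2014-02-05 /backup
```

A whole month's chain (the full file plus its incrementals) can be removed once its newest incremental has fallen out of the retention window, so expect to hold up to about two months of files at a time.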

Or you could look into other backup solutions. There is rsnapshot, which uses rsync and hard links. It does a very good job at rotation and is very bandwidth efficient, since it requires a full backup only once.

Then there is bareos. It does incrementals, which save bandwidth and space. It has a very nice feature: it can calculate a full backup from a set of incrementals. This enables you to delete old incrementals. But it's a rather complex system and intended for larger setups.

The best solution, however, is to use ZFS on the receiving side. It will be bandwidth efficient, storage efficient and much faster than the other solutions. The only real drawback I can think of is that you should have a minimum of 8 GiB ECC memory on that box (you might be fine with 4 GiB if you don't run any services and only use it to zfs receive).