ZFS: send / receive with rolling snapshots

My small home server runs on a distribution featuring ZFS.
On that system, I implemented a rolling snapshot scheme:

  • every hour, a snapshot is created
  • once a day, the chain is thinned so that I have a set of hourly / daily / weekly / monthly snapshots

I would like to store an offsite backup of some of the file systems on a USB drive in my office. The plan is to update the drive every other week.
However, due to the rolling snapshot scheme, I am having trouble implementing incremental backups.

To give you an illustration, this is my desired procedure:

  1. Initial snapshot: zfs snap tank/fs@snap0
  2. Transfer initial snapshot: zfs send tank/fs@snap0 | zfs recv -Fduv backup_tank
  3. Store backup_tank offsite
  4. Make a few snapshots:
    zfs snap tank/fs@snap1,
    zfs snap tank/fs@snap2
  5. Thin the chain:
    zfs destroy tank/fs@snap0
  6. Return backup_tank and make an incremental update of the filesystem
  7. Obviously, zfs send -I snap0 tank/fs@snap2 | zfs recv -Fduv backup_tank fails as snap0 does not exist on tank anymore.

Long story short:

Is there a clever solution for combining thinning of snapshot chains and incremental send / recv? Every time I attach the drive and run some commands, I would like to have a copy of the file system as of that point in time. In this example, backup_tank should contain the snapshots fs@snap1 and fs@snap2.

Best Answer

You can't do exactly what you want.

Whenever you create an incremental zfs send stream, that stream is created as the delta between two snapshots. (That's the only way to do it as ZFS is currently implemented.) In order to apply that stream to a different dataset, the target dataset must contain the starting snapshot of the stream; if it doesn't, there is no common point of reference for the two. When you destroy the @snap0 snapshot on the source dataset, you create a situation that is impossible for ZFS to reconcile.
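To make the "common point of reference" requirement concrete, here is a minimal sketch that looks for the newest snapshot name present on both sides. The helper latest_common and the sample lists are hypothetical and only mirror the question's example; in practice the lists could come from something like zfs list -H -t snapshot -o name -s creation, with the dataset prefix stripped.

```shell
#!/bin/sh
# latest_common SRC_LIST DST_LIST   (hypothetical helper, not part of ZFS)
# Each argument is a whitespace-separated list of snapshot names (the part
# after "@"), oldest first. Assumes the names contain no whitespace.
# Prints the newest source snapshot that also exists on the target and
# returns 0; prints nothing and returns non-zero if there is none.
latest_common() {
    src_list="$1"
    dst_list="$2"
    common=""
    for s in $src_list; do
        for d in $dst_list; do
            if [ "$s" = "$d" ]; then
                common="$s"    # later matches overwrite earlier ones
            fi
        done
    done
    [ -n "$common" ] && printf '%s\n' "$common"
}

# Sample data mirroring the question: snap0 was destroyed on the source,
# while the backup pool only ever received snap0.
src="snap1 snap2"
dst="snap0"

if base=$(latest_common "$src" "$dst"); then
    echo "incremental send possible from @$base"
else
    echo "no common snapshot: an incremental send cannot be applied"
fi
```

With the lists above, the else branch runs, which is the situation reached in step 7 of the question.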

The way to do what you are asking is to keep one snapshot in common between both datasets at all times, and use that common snapshot as the starting point for the next send stream.

So, you might in step 1 create a snapshot @backup0, and then some time around step 6 create and use a snapshot @backup1 to use for updating the off-site backup. You then transfer the stream that is the delta between @backup0 and @backup1 (which will include all intermediate snapshots), then delete @backup0 but keep @backup1 (which becomes the new common denominator). Next time you refresh the backup, you might create @backup2 (instead of @backup1) and transfer the delta between @backup1 and @backup2 (instead of @backup0 and @backup1) followed by deleting @backup1 (instead of @backup0), and so on.
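The rotation described above can be sketched as a small dry-run script. It only prints the commands for one refresh so they can be reviewed before being run by hand. The function name plan_refresh is hypothetical; the dataset and pool names and the recv flags are taken from the question, and the @backupN naming follows this answer.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for one backup refresh.
SRC="tank/fs"        # source dataset (name from the question)
DST="backup_tank"    # pool on the USB drive (name from the question)

# plan_refresh N   (hypothetical helper)
# N is the number of the @backupN snapshot currently present on BOTH sides.
plan_refresh() {
    prev="$1"
    next=$((prev + 1))
    # 1. Create the new snapshot that will become the common one.
    echo "zfs snapshot ${SRC}@backup${next}"
    # 2. Send everything between the old and the new common snapshot;
    #    -I includes the intermediate snapshots that still exist.
    echo "zfs send -I ${SRC}@backup${prev} ${SRC}@backup${next} | zfs recv -Fduv ${DST}"
    # 3. Drop the old common snapshot on the source. This should only be
    #    run once the receive in step 2 has succeeded.
    echo "zfs destroy ${SRC}@backup${prev}"
}

# Example: @backup0 is the current common snapshot.
plan_refresh 0
```

The hourly / daily thinning job can destroy any other snapshot freely, as long as it never touches the current @backupN snapshot.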