If you store the snapshots in files, as opposed to in the file system (e.g. with zfs receive), I'm afraid this is not possible.
ZFS on the receiving side
If you use ZFS on the sending and on the receiving side you can avoid having
to transfer the whole snapshot and only transfer the differences of the
snapshot compared to the previous one:
ssh myserver 'zfs send -i pool/dataset@2014-02-04 pool/dataset@2014-02-05' | \
zfs receive backupPool/dataset
ZFS knows about the snapshots and stores shared blocks only once. Because the file system understands the snapshots, you can delete the old ones without problems.
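For example, once the newer snapshot has arrived, the older one can be rotated out on both machines (a minimal sketch, assuming the receive target backupPool/dataset from the example above):
# Drop the superseded snapshot on the source ...
ssh myserver 'zfs destroy pool/dataset@2014-02-04'
# ... and on the backup machine; the remaining snapshots stay fully usable.
zfs destroy backupPool/dataset@2014-02-04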
Other file system on the receiving side
In your case you store the snapshots in individual files, and your file system is unaware of the snapshots. As you already noticed, this breaks rotation. You have two options: either transmit entire snapshots, which wastes bandwidth and storage space but lets you delete individual snapshots since they don't depend on each other, or do incremental sends like this:
ssh myserver 'zfs send -i pool/dataset@2014-02-04 pool/dataset@2014-02-05' \
> incremental-2014-02-04:05
To restore an incremental snapshot you need the previous snapshots as well.
This means you can't delete the old incrementals.
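In practice, each stream has to be received in order, oldest first (a minimal sketch, assuming a hypothetical full stream file full-2014-02-04 and a restore target restorepool/dataset):
# The full stream creates the dataset; each incremental is applied on top of it.
zfs receive restorepool/dataset < full-2014-02-04
zfs receive restorepool/dataset < incremental-2014-02-04:05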
Possible solutions
You could do incrementals as shown in my last example and take a new non-incremental (full) snapshot every month. The new incrementals depend only on that full snapshot, so you're free to delete everything from before it.
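A minimal sketch of that scheme, reusing the file-based commands from above (dates and file names are only illustrative):
# Start of the month: one full, non-incremental stream
ssh myserver 'zfs send pool/dataset@2014-02-01' > full-2014-02-01
# Every following day: an incremental against the previous day's snapshot
ssh myserver 'zfs send -i pool/dataset@2014-02-04 pool/dataset@2014-02-05' \
> incremental-2014-02-04:05
# Once the full stream for March exists, all the February files can be deleted.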
Or you could look into other backup solutions. There is
rsnapshot, which uses rsync
and hard links.
It does a very good job at rotation and is very bandwidth efficient, since it
requires a full backup only once.
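A minimal sketch of what an rsnapshot setup might look like (all paths and retention values are only illustrative, not a recommendation):
# Excerpt of a hypothetical /etc/rsnapshot.conf on the backup host
# (rsnapshot requires tabs between the fields):
#   snapshot_root   /backup/rsnapshot/
#   retain          daily   7
#   retain          weekly  4
#   backup          root@myserver:/home/    myserver/
# Run from cron on the backup host; unchanged files are hard-linked between
# runs, so only changed files cost extra space and bandwidth:
rsnapshot daily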
Then there is bareos. It does incrementals, which are bandwidth- and space-saving. It has a very nice feature: it can calculate a full backup from a set of incrementals. This enables you to delete old incrementals. But it's a rather complex system, intended for larger setups.
The best solution, however, is to use ZFS on the receiving side. It will be bandwidth efficient, storage efficient and much faster than the other solutions. The only real drawback I can think of is that you should have a minimum of 8 GiB of ECC memory on that box (you might be fine with 4 GiB if you don't run any services and only use it for zfs receive).
Disclaimer: As I've never used zvols, I cannot say whether they behave any differently in replication than normal filesystems or snapshots. I assume they do, but do not take my word for it.
Your question is actually multiple questions; I'll try to answer them separately:
How to replicate/mirror complete pool to remote location
You need to split the task into two parts: first, the initial replication has to be complete; afterwards, incremental replication is possible, as long as you do not mess with your replication snapshots. To enable incremental replication, you need to preserve the last replication snapshots; everything before that can be deleted. If you delete the previous snapshot, zfs recv will complain and abort the replication. In this case you have to start all over again, so try not to do this.
If you just need the correct options, they are:
zfs send:
- -R: send everything under the given pool or dataset (recursive replication, needed all the time, includes -p). Also, when receiving, all snapshots deleted on the source are deleted on the destination.
- -I: include all intermediate snapshots between the last replication snapshot and the current replication snapshot (needed only with incremental sends)
zfs recv:
- -F: expand the target pool, including deletion of existing datasets that were deleted on the source
- -d: discard the name of the source pool and replace it with the destination pool name (the rest of the file system paths will be preserved, and if needed also created)
- -u: do not mount the file system on the destination
If you prefer a complete example, here is a small script:
#!/bin/sh
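# Note: the "Initial send" section below is only needed once; later runs only
# need the "Incremental sends" section and the cleanup at the end.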
# Setup/variables:
# Each snapshot name must be unique, timestamp is a good choice.
# You can also use Solaris date, but I don't know the correct syntax.
snapshot_string=DO_NOT_DELETE_remote_replication_
timestamp=$(/usr/gnu/bin/date '+%Y%m%d%H%M%S')
source_pool=tank
destination_pool=tank
new_snap="$source_pool"@"$snapshot_string""$timestamp"
destination_host=remotehostname
# Initial send:
# Create first recursive snapshot of the whole pool.
zfs snapshot -r "$new_snap"
# Initial replication via SSH.
zfs send -R "$new_snap" | ssh "$destination_host" zfs recv -Fdu "$destination_pool"
# Incremental sends:
# Get old snapshot name.
old_snap=$(zfs list -H -o name -t snapshot -r "$source_pool" | grep "$source_pool"@"$snapshot_string" | tail --lines=1)
# Create new recursive snapshot of the whole pool.
zfs snapshot -r "$new_snap"
# Incremental replication via SSH.
zfs send -R -I "$old_snap" "$new_snap" | ssh "$destination_host" zfs recv -Fdu "$destination_pool"
# Delete older snaps on the local source (grep -v inverts the selection)
delete_from=$(zfs list -H -o name -t snapshot -r "$source_pool" | grep "$snapshot_string" | grep -v "$timestamp")
for snap in $delete_from; do
zfs destroy "$snap"
done
Use something faster than SSH
If you have a sufficiently secured connection, for example an IPSec or OpenVPN tunnel and a separate VLAN that only exists between sender and receiver, you may switch from SSH to unencrypted alternatives like mbuffer as described here, or you could use SSH with weak/no encryption and disabled compression, which is detailed here. There was also a website about recompiling SSH to make it much faster, but unfortunately I don't remember the URL - I'll edit this answer later if I find it.
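A minimal sketch of an unencrypted transfer with mbuffer (port, buffer sizes and host names are only illustrative; only do this on a trusted network):
# On the receiving host: listen on a TCP port and feed the stream to zfs recv
mbuffer -s 128k -m 1G -I 9090 | zfs recv -Fdu tank
# On the sending host: pipe zfs send into mbuffer towards the receiver
zfs send -R -I "$old_snap" "$new_snap" | mbuffer -s 128k -m 1G -O receiverhost:9090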
For very large datasets and slow connections, it may also be useful to do the first transmission via hard disk (use an encrypted disk to store the zpool and transmit it in a sealed package via courier, mail or in person). As the method of transmission does not matter for send/recv, you can pipe everything to the disk, export the pool, send the disk to its destination, import the pool and then transmit all incremental sends via SSH.
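A rough sketch of such a disk-based seeding step (pool and device names are only illustrative):
# On the source: create a pool on the external (ideally encrypted) disk and
# receive the initial full stream into it, then export the pool for shipping.
zpool create seedpool /dev/sdX
zfs snapshot -r tank@seed
zfs send -R tank@seed | zfs recv -Fdu seedpool
zpool export seedpool
# Ship the disk. On the destination: import it and copy into the real pool.
zpool import seedpool
zfs send -R seedpool@seed | zfs recv -Fdu tank
# All later transfers are incremental sends over the network, as above.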
The problem with messed up snapshots
As stated earlier, if you delete/modify your replication snapshots, you will receive the error message
cannot send 'pool/fs@name': not an earlier snapshot from the same fs
which means either your command was wrong or you are in an inconsistent state where you must remove the snapshots and start all over.
This has several negative implications:
- You cannot delete a replication snapshot until the new replication snapshot has been successfully transferred. As these replication snapshots include the state of all other (older) snapshots, space from deleted files and snapshots will only be reclaimed when the replication finishes. This may lead to temporary or permanent space problems on your pool, which you can only fix by restarting or finishing the complete replication procedure.
- You will have many additional snapshots, which slows down the zfs list command (except on Oracle Solaris 11, where this was fixed).
- You may need to protect the snapshots against (accidental) removal, except by the script itself.
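One way to do that protection is with snapshot holds: a held snapshot cannot be destroyed until the hold is released (a minimal sketch using the variables from the script above; the tag name is only illustrative):
# Place a recursive hold on the current replication snapshot ...
zfs hold -r replication_keep "$new_snap"
# ... and release the hold on the superseded one before destroying it.
zfs release -r replication_keep "$old_snap"
zfs destroy -r "$old_snap"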
There exists a possible solution to those problems, but I have not tried it myself. You could use zfs bookmark, a new feature in OpenSolaris/illumos created specifically for this task. This would free you from snapshot management. The only downside is that, at present, it only works for single datasets, not recursively. You would have to save a list of all your old and new datasets and then loop over them, bookmarking, sending and receiving them, and then updating the list (or a small database, if you prefer).
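A minimal sketch of what such a bookmark-based cycle might look like for a single dataset (all names are only illustrative, and I have not tested this myself):
# After a snapshot has been replicated, convert it into a bookmark so the
# snapshot itself can be destroyed on the source without breaking future
# incremental sends.
zfs bookmark tank/data@repl_20140205 tank/data#repl_20140205
zfs destroy tank/data@repl_20140205
# Next run: send the new snapshot incrementally from the bookmark.
zfs snapshot tank/data@repl_20140206
zfs send -i tank/data#repl_20140205 tank/data@repl_20140206 | \
ssh remotehostname zfs recv -u tank/data
# The destination still keeps the real snapshots; bookmarks only replace them
# on the sending side.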
If you try the bookmark route, I would be interested to hear how it worked out for you!
Best Answer
Thank you for all your suggestions!
I finally rsynced pool/dataset@snap3 with backupPool/dataset@snap2, deleted the dataset backupPool/dataset, and recreated it from backupPool/dataset. I was not able to find a better solution to this problem. The suggestion from Dan was really helpful. Also, to avoid deleting snapshots in the future, it is a good practice to hold them (zfs hold).