Btrfs – How to Check if Send/Receive Worked Properly

btrfs

btrfs send and receive can be used to transfer terabytes of data, but these commands don't produce helpful progress output (even with -v). How can I check if they succeeded?

For example, I create a new subvolume called source, write 1 GB of random data into it, and make it read-only so that it can be sent:

# btrfs subvolume create source
# head -c 1G < /dev/urandom > source/random_data
# btrfs property set source ro true

Then, create a copy of the new subvolume using btrfs send and receive, but interrupt the process before it completes:

# mkdir destination
# btrfs send source | btrfs receive destination
At subvol source
At subvol source
^C

btrfs subvolume list will not indicate that anything has gone wrong:

# btrfs subvolume list .
ID 1216 gen 370739 top level 5 path source
ID 1219 gen 371244 top level 5 path destination/source

The new subvolume can be browsed normally, although its data is clearly incomplete (251M instead of 1.1G):

# exa -lT
   - ├── destination
   - │  └── source
251M │     └── random_data
   - └── source
1.1G    └── random_data

btrfs subvolume show destination/source does not warn us that the subvolume is incomplete. It does show that destination/source has a different UUID to source, and it looks as though destination/source's Received UUID will be set to source's UUID if and only if btrfs receive ran to completion.
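A minimal sketch of that comparison, assuming the tab-indented "Field: value" layout that btrfs subvolume show prints. The function names show_field and received_uuid_matches are made up for illustration, and a match only tests the condition described above; it does not prove the copy is complete or unmodified.

```shell
#!/bin/sh
# Extract one field (e.g. "UUID" or "Received UUID") from the text printed
# by `btrfs subvolume show`, read on stdin. Hypothetical helper.
show_field() {
    awk -F': *' -v key="$1" '
        { name = $1; sub(/^[ \t]+/, "", name) }
        name == key {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            print val
            exit
        }
    '
}

# Succeed only if the copy's Received UUID equals the original's UUID.
# $1 = original subvolume, $2 = subvolume created by btrfs receive.
received_uuid_matches() (
    src_uuid=$(btrfs subvolume show "$1" | show_field "UUID")
    dst_received=$(btrfs subvolume show "$2" | show_field "Received UUID")
    [ -n "$src_uuid" ] && [ "$src_uuid" != "-" ] && [ "$src_uuid" = "$dst_received" ]
)
```

With the example above, this would be called as received_uuid_matches source destination/source, run as root like the other commands.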

Does the presence of the Received UUID guarantee that a subvolume created by btrfs receive is a complete and unmodified copy of the subvolume with that UUID on another filesystem?

This part of man btrfs-send suggests not, and seems to imply that using destination/source in the above example as the parent of a future snapshot of source would fail to detect and repair the corruption as well. However, I'm still not completely clear on the purpose of send -c and whether this advice also applies to send -p.

In the incremental mode (options -p and -c), previously sent snapshots that are available on both the sending and receiving side can be used to reduce the amount of information that has to be sent to reconstruct the sent snapshot on a different filesystem.

The -p <parent> option can be omitted when -c <clone-src> options are given, in which case btrfs send will determine a suitable parent from among the clone sources.

You must not specify clone sources unless you guarantee that these snapshots are exactly in the same state on both sides—both for the sender and the receiver.

From what I can tell, snap-sync, buttersink and other similar tools deal with this problem by redirecting the output of btrfs send to a series of files, and transferring them using a reliable method like rsync rather than a simple pipe. Is that the right approach to take, if I want to develop my own incremental backup solution without relying on third-party software that isn't packaged by my distro?

Best Answer

I have more than ten backup systems based on exactly the last part of what you described. Direct pipes have never been an option for me, since I back up more than 1 TB over the network. I could not risk losing a single bit and wasting hours of work.

My final setup is as follows.

Bootstrap Phase:

  1. Take first full snapshot
  2. Send snapshot to local file (-f option)
  3. Rsync or physical media transfer of snapshot file to remote site.
  4. Remote receive of first snapshot
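
The bootstrap steps above could be sketched as the shell functions below. This is an illustration under assumptions, not the exact scripts in use: the function names, the run helper, the DRY_RUN variable, and every path or host passed in are hypothetical.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN is not 1 (hypothetical helper).
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Steps 1-3, on the sending side.
# $1 = subvolume, $2 = read-only snapshot to create,
# $3 = local stream file, $4 = rsync target such as host:/path/
bootstrap_send() (
    subvol=$1 snap=$2 stream=$3 remote=$4
    run btrfs subvolume snapshot -r "$subvol" "$snap" &&
    run btrfs send -f "$stream" "$snap" &&
    run rsync --partial "$stream" "$remote"
)

# Step 4, on the receiving side.
# $1 = transferred stream file, $2 = directory on the target btrfs filesystem
bootstrap_receive() (
    stream=$1 dest=$2
    run btrfs receive -f "$stream" "$dest"
)
```

Setting DRY_RUN=1 before calling either function prints the commands without running them, which is a cheap way to check the paths first.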

Incremental Phase:

  1. New local snapshot
  2. Local generation and send to file of diff between current and last snapshot
  3. Rsync to remote site
  4. Remote import of transferred snapshot file
  5. Cleaning logic (think about retention, remove old snapshots...)
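
A rough sketch of one incremental round, assuming the diff is produced with btrfs send -p against the previous snapshot. As before, the function names, the run helper, DRY_RUN, and all arguments are hypothetical, and the cleanup function shows only the deletion itself, not a retention policy.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN is not 1 (hypothetical helper).
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Steps 1-3: take a new read-only snapshot, write only the changes since
# the previous snapshot to a file, and copy that file to the remote site.
# $1 = subvolume, $2 = previous snapshot, $3 = new snapshot,
# $4 = local stream file, $5 = rsync target such as host:/path/
incremental_send() (
    subvol=$1 prev=$2 new=$3 stream=$4 remote=$5
    run btrfs subvolume snapshot -r "$subvol" "$new" &&
    run btrfs send -p "$prev" -f "$stream" "$new" &&
    run rsync --partial "$stream" "$remote"
)

# Step 4: import on the remote side. The received copy of the previous
# snapshot must already exist in the destination filesystem.
incremental_receive() (
    stream=$1 dest=$2
    run btrfs receive -f "$stream" "$dest"
)

# Step 5: delete one snapshot that the retention logic no longer needs.
prune_snapshot() (
    run btrfs subvolume delete "$1"
)
```

The previous snapshot has to stay on both sides until the next round has been received, since it is the parent that the diff is applied to.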

This has been up and running for 3 years. In the worst case, when the snapshots don't match, it's enough to delete the last two (one local, one remote) to get it working again with the next send.

Good luck
