btrfs send and receive can be used to transfer terabytes of data, but these commands don't produce helpful progress output (even with -v). How can I check if they succeeded?
For example, suppose I create a new subvolume called source, write 1 GB of random data into it, and make it read-only so that it can be sent:
# btrfs subvolume create source
# head -c 1G < /dev/urandom > source/random_data
# btrfs property set source ro true
Then I create a copy of the new subvolume using btrfs send and receive, but interrupt the process before it completes:
# mkdir destination
# btrfs send source | btrfs receive destination
At subvol source
At subvol source
^C
btrfs subvolume list will not indicate that anything has gone wrong:
# btrfs subvolume list .
ID 1216 gen 370739 top level 5 path source
ID 1219 gen 371244 top level 5 path destination/source
The new subvolume can be browsed normally, although clearly its data is corrupt:
# exa -lT
   - ├── destination
   - │  └── source
251M │     └── random_data
   - └── source
1.1G    └── random_data
btrfs subvolume show destination/source does not warn us that the subvolume is incomplete. It does show that destination/source has a different UUID to source, and it looks as though destination/source's Received UUID will be set to source's UUID if and only if btrfs receive ran to completion.
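The Received UUID comparison described above could be scripted. This is a minimal sketch, assuming the output of btrfs subvolume show keeps its usual "Label: value" layout with "-" for an unset Received UUID. show_field and received_matches are hypothetical helper names, and a match only automates the observation above; it is not a guarantee that the data is intact.

```shell
#!/bin/sh
# Print the value of one field from `btrfs subvolume show` output read on stdin.
# $1 is the exact field label, e.g. "UUID" or "Received UUID" (assumed layout).
show_field() {
    awk -v label="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop leading indentation
            if (index(line, label ":") == 1) {  # label must start the line
                print $NF                       # value is the last field
                exit
            }
        }'
}

# Succeed only if the copy's "Received UUID" equals the source's "UUID".
# $1 = file holding `btrfs subvolume show` output for the source
# $2 = file holding `btrfs subvolume show` output for the received copy
received_matches() {
    src_uuid=$(show_field "UUID" < "$1")
    rcv_uuid=$(show_field "Received UUID" < "$2")
    [ -n "$src_uuid" ] && [ "$rcv_uuid" != "-" ] && [ "$src_uuid" = "$rcv_uuid" ]
}

# Example use (paths are the ones from the question):
#   btrfs subvolume show source             > src.txt
#   btrfs subvolume show destination/source > dst.txt
#   received_matches src.txt dst.txt && echo "Received UUID matches"
```

Because the helpers read saved output rather than calling btrfs themselves, they can be tried on captured text from either machine.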
Does the presence of the Received UUID guarantee that a subvolume created by btrfs receive is a complete and unmodified copy of the subvolume with that UUID on another filesystem?
This part of man btrfs-send suggests not, and seems to imply that using destination/source in the above example as the parent of a future snapshot of source would fail to detect and repair the corruption as well. However, I'm still not completely clear on the purpose of send -c and whether this advice also applies to send -p.
In the incremental mode (options -p and -c), previously sent snapshots that are available on both the sending and receiving side can be used to reduce the amount of information that has to be sent to reconstruct the sent snapshot on a different filesystem.

The -p <parent> option can be omitted when -c <clone-src> options are given, in which case btrfs send will determine a suitable parent from among the clone sources.

You must not specify clone sources unless you guarantee that these snapshots are exactly in the same state on both sides, both for the sender and the receiver.
From what I can tell, snap-sync, buttersink and other similar tools deal with this problem by redirecting the output of btrfs send to a series of files, and transferring them using a reliable method like rsync rather than a simple pipe. Is that the right approach to take, if I want to develop my own incremental backup solution without relying on third-party software that isn't packaged by my distro?
Best Answer
I have more than ten backup systems based exactly on the last part of what you said. Direct pipes have never been an option for me, since I deal with backups over the network that are > 1 TB. I could not risk losing a single bit and wasting hours of work.
My final setup is as follows.
Bootstrap Phase:
Incremental Phase:
New local snapshot
Generate the diff between the current and last snapshot locally and send it to a file
Rsync to remote site
Remote import of transferred snapshot file
Cleaning logic (think about retention, remove old snapshots...)
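
The incremental steps above could be sketched roughly as follows. This is an illustration under assumptions, not my actual scripts: every path, the host name and the function names run and incremental_cycle are hypothetical, remote paths are assumed to contain no spaces, and the cleaning step is left out because retention is site-specific. By default it only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# incremental_cycle SRC SNAPDIR PREV NEW SPOOL REMOTE REMOTE_SPOOL REMOTE_SNAPDIR
# All arguments are hypothetical placeholders for illustration.
# The body runs in a subshell so a failed step stops the cycle
# without terminating the calling shell.
incremental_cycle() (
    src=$1 snapdir=$2 prev=$3 new=$4
    spool=$5 remote=$6 rspool=$7 rsnapdir=$8
    stream="$new.btrfs"

    # 1. New local read-only snapshot.
    run btrfs subvolume snapshot -r "$src" "$snapdir/$new" || exit 1
    # 2. Write the diff between the last and the new snapshot to a file.
    run btrfs send -p "$snapdir/$prev" -f "$spool/$stream" "$snapdir/$new" || exit 1
    # 3. Transfer the stream file; --partial keeps an interrupted
    #    transfer so a later rsync run can continue from it.
    run rsync --partial "$spool/$stream" "$remote:$rspool/" || exit 1
    # 4. Import the transferred file on the remote side.
    run ssh "$remote" btrfs receive -f "$rspool/$stream" "$rsnapdir" || exit 1
    # 5. Cleaning logic (retention, removing old snapshots) is omitted here.
)

# Example (prints the four commands without running them):
#   incremental_cycle /data /snapshots 2024-01-01 2024-01-02 \
#       /var/spool/btrfs backuphost /var/spool/btrfs /backups
```

Writing the stream to a file first means a broken network connection only costs a retry of the rsync step, not a new btrfs send.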
This has been up and running for 3 years. In the worst case, when the snapshots don't match, it's enough to delete the last two (1 local, 1 remote) to get it working again with the next send.
Good luck