Linux – Rsync backups on btrfs very slow

backup, btrfs, hardlink, linux, rsync

My environment is Ubuntu 15.04 with kernel 3.19.0-28-generic and Btrfs v3.17.

I have two identical external USB hard disks that I use with my backup script. One of them is formatted with btrfs and the other one with ext4. The source filesystem is always ext4. The rsync command looks like this:

rsync --inplace --no-whole-file --link-dest="$previousBackup" "$sourceDir" "$destDir"

I just realized that the backup to btrfs takes an extremely long time: slightly more than one hour, compared with the 4 minutes it takes to perform the same copy to ext4.

To rule out a disk malfunction, I ran some benchmarks with dd and the “disk utility” shipped with Ubuntu, and got the same performance on both disks. The slow part seems to be hardlinking against the previous backup. Even after a defrag and scrub, the following command takes around 53 minutes on btrfs, but only 1 minute on ext4:

cp -arl "$previousBackup" "$destDir"
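To reproduce this measurement in isolation, something like the following sketch could be run against each disk. It builds a small throwaway tree, hard-links it with cp -al, times the operation, and checks that links were really created. The file count of 200, the directory names, and the use of TMPDIR to select the disk are assumptions for illustration, not part of the original backup script.

```shell
#!/bin/sh
# Hypothetical benchmark sketch: time a hard-link copy of a small test tree.
# Point TMPDIR at a directory on the disk under test; mktemp falls back to /tmp.
workdir=$(mktemp -d)
mkdir -p "$workdir/previous/sub"

# Create 200 small files to stand in for a previous backup (arbitrary count).
i=0
while [ "$i" -lt 200 ]; do
    echo "file $i" > "$workdir/previous/sub/file_$i"
    i=$((i + 1))
done

# Hard-link the whole tree, as in the question, and time it (1 s resolution).
start=$(date +%s)
cp -al "$workdir/previous" "$workdir/current"
end=$(date +%s)
echo "hard-link copy took $((end - start)) s"

# Each linked file should now report a link count of 2 (GNU stat syntax).
links=$(stat -c %h "$workdir/current/sub/file_0")
count=$(find "$workdir/current" -type f | wc -l)
echo "link count of file_0: $links, files in copy: $count"

# Remove the throwaway tree again.
rm -rf "$workdir"
```

A tree this small will not show a difference of minutes, so the file count would need to be raised to something close to the real backup before the timing is meaningful.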

Searching the Internet, I found hints that btrfs performance suffers with hardlinks, but I would not expect such a huge difference. I found out that this command is faster, but still takes over 30 minutes to complete:

cp -ar --reflink "$previousBackup" "$destDir"
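Note that a reflink copy is not the same thing as a hard link: a hard link is a second name for the same inode, while a reflink copy is a new inode whose data blocks are shared on copy-on-write filesystems such as btrfs. The sketch below illustrates the difference on a throwaway file. It uses --reflink=auto so that it also runs on filesystems without clone support, where cp falls back to an ordinary copy; a bare --reflink means --reflink=always in GNU cp and would fail there. The file names are made up for the example.

```shell
#!/bin/sh
# Sketch: compare a hard link with a reflink-style copy of the same file.
# --reflink=auto falls back to a normal copy where the filesystem cannot clone.
d=$(mktemp -d)
echo "payload" > "$d/orig"

ln "$d/orig" "$d/hard"                    # second name, same inode
cp --reflink=auto "$d/orig" "$d/clone"    # new inode; data shared only on CoW filesystems

# Read the inode numbers (GNU stat syntax).
ino_orig=$(stat -c %i "$d/orig")
ino_hard=$(stat -c %i "$d/hard")
ino_clone=$(stat -c %i "$d/clone")
echo "orig=$ino_orig hard=$ino_hard clone=$ino_clone"

# Remove the throwaway directory again.
rm -rf "$d"
```

The hard link reports the same inode number as the original, while the copy gets its own, so a reflinked backup tree creates one new inode per file rather than one new directory entry per file.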

Does anyone have experience with this behaviour and can confirm it? Is there any simple way to correct it (e.g. different mount options) or should I try to delete as many hardlinks as possible and just use reflinks?

EDIT

I just found out that even deleting a directory from btrfs requires more than one hour. The same operation is instantaneous on the "twin" ext4 disk. There is obviously a problem with metadata here.

Best Answer

You say you are copying hardlinks with your rsync command, but where is the -H flag? I don’t see it in your command:

rsync --inplace --no-whole-file --link-dest="$previousBackup" "$sourceDir" "$destDir"

As I understand rsync's handling of hard links, without the -H flag files that are hard-linked to each other in the source are transferred as separate files rather than being linked on the receiving side, as explained on the rsync man page:

-H, --hard-links

This tells rsync to look for hard-linked files in the transfer and link together the corresponding files on the receiving side. Without this option, hard-linked files in the transfer are treated as though they were separate files.
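Whether -H makes any difference depends on whether the source tree actually contains files that are hard-linked to each other. One quick way to check, sketched here on a throwaway directory with made-up file names, is to list regular files whose link count is above one:

```shell
#!/bin/sh
# Illustrative only: build a tiny tree with one hard-linked pair and one plain file.
src=$(mktemp -d)
echo "data" > "$src/a"
ln "$src/a" "$src/b"        # b is a second name for the same inode as a
echo "other" > "$src/c"     # c has a single link

# List regular files with more than one link; these are the ones -H would keep linked.
hardlinked=$(find "$src" -type f -links +1 | sort)
echo "$hardlinked"

# Count them for a quick summary.
n_hardlinked=$(find "$src" -type f -links +1 | wc -l)
echo "hard-linked files: $n_hardlinked"

# Remove the throwaway directory again.
rm -rf "$src"
```

Run against the real source by replacing $src with "$sourceDir" and dropping the setup and cleanup lines; an empty result would suggest that -H has nothing to do for this transfer.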

I can imagine that copying many similar files over and over, instead of hard-linking them, would add up to a slower transfer.

Also, consider using the -z (--compress) flag as well:

-z, --compress

With this option, rsync compresses the file data as it is sent to the destination machine, which reduces the amount of data being transmitted -- something that is useful over a slow connection.

Yes, this is a USB to USB transfer on the same system, so the speed is likely already about as good as it gets, but it may be worth testing whether -z helps you overcome natural USB data transfer bottlenecks. Keep in mind that compression costs CPU time, so it may not pay off for a local copy.

A nice, simple tutorial that explains these flags—as well as others—can be found here.
