How to do an efficient incremental backup to cloud

backup compression rsync

I have a folder that contains files from current and previous projects, which I plan to back up using versioned rsync. For a more robust backup strategy, I also want to store a snapshot offsite (e.g. Amazon Glacier) every month.

To save space and bandwidth, I want to compress the backup before sending it offsite. However, since only a small fraction of the files change from month to month, sending the whole compressed library every time would be a huge waste of bandwidth.

Ideally, I want to compress the backup into volumes of 500 MB (or some other size) and upload them to my offsite storage. The next time I back up, most of these volumes should be identical to those of the previous backup, except for the ones containing files that have changed since then. In this scenario I only need to upload the changed volumes, saving bandwidth (and file write requests).

Is it possible to do what I describe using a combination of tar and gzip (and maybe split), or with other command-line tools?

One issue I can imagine is that if a file in some volume changes, the contents of all subsequent volumes may be offset, requiring a re-upload of the changed volume and every volume after it. Perhaps it's better to segment the volumes by folder somehow?

I would love to hear any input or suggestions you have.
Best regards
M

Best Answer

tar can do this with the --listed-incremental option, so for what you describe I would probably use that. You can compress the archive with any compressor tar supports (or just pipe it through an arbitrary compressor). See https://www.gnu.org/software/tar/manual/html_section/tar_39.html
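As a rough sketch of how this might look in practice (assuming GNU tar is the `tar` on your PATH; every path and file name below is made up and lives in a scratch directory, so adapt it to your own layout): the first run creates a full archive plus a snapshot file, and later runs against the same snapshot file only pick up what changed.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar. All paths are hypothetical scratch locations.
set -eu

work=$(mktemp -d)
src="$work/projects"         # stand-in for the real project folder
out="$work/backups"          # archives and the snapshot file go here
snar="$out/projects.snar"    # GNU tar snapshot (state) file
mkdir -p "$src/alpha" "$src/beta" "$out"
echo "first"  > "$src/alpha/a.txt"
echo "second" > "$src/beta/b.txt"

# Level 0: the snapshot file does not exist yet, so everything is archived.
tar --create --gzip --listed-incremental="$snar" \
    --file="$out/level0.tar.gz" -C "$work" projects

# Keep a copy of the level-0 snapshot so later incrementals can be taken
# relative to the full backup again if needed.
cp "$snar" "$out/projects.level0.snar"

# Simulate one file changing between backups.
sleep 1
echo "changed" > "$src/alpha/a.txt"

# Level 1: tar compares against the snapshot file and archives only
# what is new or modified (directory entries are always recorded).
tar --create --gzip --listed-incremental="$snar" \
    --file="$out/level1.tar.gz" -C "$work" projects

# Show what ended up in the incremental archive.
tar --list --gzip --file="$out/level1.tar.gz"
```

Only level1.tar.gz would need uploading after the second run. To restore, extract the level-0 archive first and then each incremental in order, passing --listed-incremental=/dev/null on extraction as the GNU tar manual describes.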

I'm not sure what sort of projects these are, but if it's code or some other text-based format I'd probably look into using git or some other source control system.

I should also point out that this is GNU tar. If you are on a BSD or another Unix, you might need to install GNU tar (gnutar), because I don't think bsdtar supports this.
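If you're not sure which tar you have, something along these lines should tell you. The candidate names `gtar` and `gnutar` are just guesses at what a GNU tar package might install on your system, and `find_gnu_tar` is a helper name I made up for this sketch.

```shell
#!/bin/sh
# Sketch only: look for a binary that identifies itself as GNU tar.
# The candidate names are assumptions; packages differ between systems.
find_gnu_tar() {
    for candidate in tar gtar gnutar; do
        if command -v "$candidate" >/dev/null 2>&1 &&
           "$candidate" --version 2>/dev/null | grep -q 'GNU tar'; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

TAR=$(find_gnu_tar) || { echo "GNU tar not found" >&2; exit 1; }
echo "using: $TAR"
```

You can then call "$TAR" instead of tar in your backup script.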
