I'm looking for a backup utility with incremental backups, but in a more complicated way.
I tried rsync, but it doesn't seem to be able to do what I want, or more likely, I don't know how to make it do that.
So this is an example of what I want to achieve with it.
I have the following files:
testdir
├── picture1
├── randomfile1
├── randomfile2
└── textfile1
I want to run the backup utility and basically create an archive (or a tarball) of all of these files in a different directory:
$ mystery-command testdir/ testbak
testbak
└── 2020-02-16--05-10-45--testdir.tar
Now, let's say the following day, I add a file, such that my structure looks like:
testdir
├── picture1
├── randomfile1
├── randomfile2
├── randomfile3
└── textfile1
Now when I run the mystery command, I will get another tarball for that day:
$ mystery-command testdir/ testbak
testbak
├── 2020-02-16--05-10-45--testdir.tar
└── 2020-02-17--03-24-16--testdir.tar
Here's the kicker: I want the backup utility to detect that picture1, randomfile1, randomfile2 and textfile1 have not changed since the last backup, and to back up only the new/changed files, which in this case is randomfile3, such that:
tester@raspberrypi:~ $ tar -tf testbak/2020-02-16--05-10-45--testdir.tar
testdir/
testdir/randomfile1
testdir/textfile1
testdir/randomfile2
testdir/picture1
tester@raspberrypi:~ $ tar -tf testbak/2020-02-17--03-24-16--testdir.tar
testdir/randomfile3
So as a last example, let's say the next day I changed textfile1, and added picture2 and picture3:
$ mystery-command testdir/ testbak
testbak/
├── 2020-02-16--05-10-45--testdir.tar
├── 2020-02-17--03-24-16--testdir.tar
└── 2020-02-18--01-54-41--testdir.tar
tester@raspberrypi:~ $ tar -tf testbak/2020-02-16--05-10-45--testdir.tar
testdir/
testdir/randomfile1
testdir/textfile1
testdir/randomfile2
testdir/picture1
tester@raspberrypi:~ $ tar -tf testbak/2020-02-17--03-24-16--testdir.tar
testdir/randomfile3
tester@raspberrypi:~ $ tar -tf testbak/2020-02-18--01-54-41--testdir.tar
testdir/textfile1
testdir/picture2
testdir/picture3
With this system, I would save space by only backing up the incremental changes between backups (plus, obviously, the master backup that has all the initial files), and I would keep a history of those changes. For example, if I changed a file on day 2 and changed it again on day 3, I could still get the version from day 2, before the day 3 change.
I think it's kinda like how Git works 🙂
I know I could probably write a script that runs a diff and then selects the files to back up based on the result (or, more efficiently, just computes checksums and compares them), but I want to know if there's any utility that makes this a tad easier 🙂
Best Answer
Update:
Please see some caveats here: Is it possible to use tar for full system backups?
According to that answer, restoration of incremental backups with tar is prone to errors and should be avoided. Do not use the below method unless you're absolutely sure you can recover your data when you need it.
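According to the documentation, you can use the -g/--listed-incremental option to create incremental tar files. On the first run, point -g at a snapshot file that does not exist yet (say data.inc) and tar writes a full archive (say DATE-data.tar).
On every later run, pass the same data.inc with a new archive name, and tar stores only the files that are new or changed since the previous run.
Here data.inc is your incremental metadata, and the DATE-data.tar files are your incremental archives.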
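The two runs described above, plus a restore, can be sketched roughly as follows. This is a minimal illustration that assumes GNU tar; the scratch directory, the data/, bak/ and restore/ names, the file contents and the date format in the archive names are all made up for the demo rather than taken from the answer.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar, which provides --listed-incremental (-g).
# All paths and file names below are hypothetical demo values.
set -eu

work=$(mktemp -d)                 # scratch directory; nothing real is touched
cd "$work"
mkdir data bak restore
echo "one"  > data/randomfile1
echo "text" > data/textfile1
sleep 1                           # keep file times clearly older than the snapshot

# Run 1: bak/data.inc does not exist yet, so tar writes a full archive
# and records the state of data/ in the snapshot file.
full="$work/bak/$(date +%Y-%m-%d--%H-%M-%S)--data.tar"
tar --create --listed-incremental="$work/bak/data.inc" --file="$full" data

sleep 1
echo "two" > data/randomfile2             # new file
echo "text, edited" > data/textfile1      # changed file

# Run 2: same snapshot file, new archive name; only new/changed files go in.
inc="$work/bak/$(date +%Y-%m-%d--%H-%M-%S)--data.tar"
tar --create --listed-incremental="$work/bak/data.inc" --file="$inc" data

# Show what each archive holds.
tar --list --file="$full"
tar --list --file="$inc"

# Restore: extract the full archive first, then each incremental one in order.
# /dev/null makes tar treat the archives as incremental without reading
# or updating a snapshot file.
tar --extract --listed-incremental=/dev/null --file="$full" --directory="$work/restore"
tar --extract --listed-incremental=/dev/null --file="$inc" --directory="$work/restore"
```

Note that an incremental archive still records the directory entries themselves, so the second listing shows data/ alongside the new and changed files. Given the restore caveats mentioned in the update, it is worth rehearsing the restore step on scratch data like this before relying on it.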