Linux – Create a differential backup with rsync of a directory on the local drive to another directory on the same drive

backup linux rsync

How can I use rsync (but neither rsnapshot nor rdiff-backup nor any other application) to create a differential backup of a directory located on my local drive to another directory located on that same local drive?

F. Hauri posted the following in an answer to "How to create a local backup?":

#!/bin/bash
backRepo=/media/mydisk
backSrce=/home/user
backDest=home
backCopy=copy
backCount=9

[ -d "$backRepo/$backDest" ] || mkdir "$backRepo/$backDest"

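# mirror the source into the backup repository
cd "$backSrce" || exit 1
rsync -ax --delete --exclude '*~' --exclude '.DS_Store' . "$backRepo/$backDest/."

# rotate the numbered copies: drop the oldest, shift the rest up by one
cd "$backRepo" || exit 1
[ -d "$backCopy.$backCount" ] && rm -fR "$backCopy.$backCount"
for ((i=backCount;i--;)); do
    [ -d "$backCopy.$i" ] && mv "$backCopy.$i" "$backCopy.$((i+1))"
done
((i++))

# hard-link the fresh mirror as copy.0 (takes almost no extra space)
cp -al "$backDest" "$backCopy.$i"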

It seems like the above script is fairly close to what I want, but frankly, despite spending about an hour studying "Easy Automated Snapshot-Style Backups with Linux and Rsync", I still have only a vague idea of how to make rsync do what I want.

Here's my use case:

I am editing a video locally on my machine. The sum of all of the hundreds of files associated with that video will be less than 5 GB (five gigabytes).

Currently, I use Grsync to back up my internal drive to an external USB drive. Although I figured out how to accomplish the identical task using rsync, I prefer Grsync because I merely need to launch it and click one button to back up the internal directory containing my video files to my external USB drive. The entire process is silky smooth.

Every few hours, I want a fairly smooth way to back up the above-mentioned data associated with my video to my Google Drive account. I don't mind manually choosing to upload a folder to Google Drive. I actually sort of prefer having to do so because it would help me ensure the backup was actually being accomplished.

Every few nights before I go to bed, I have been copying the entire folder containing the video files, which contains many gigs of data, up to my Google Drive account.

I prefer differential backups to incremental ones because, if I ever needed to restore my data from Google Drive, I would likely be able to do so manually without becoming confused.

Please keep in mind that I am certainly not a Unix sysadmin at a large corporation supporting hundreds of users. I am merely one guy who wants an easy method, though not necessarily a completely automated one, to back up his data offsite every few hours in case of a catastrophic loss of data, which would most likely be due to the theft of my computer. I am almost certain rsync can do what I want, so I am reluctant to install another application.

Best Answer

Here ya go!

#!/bin/bash

# written by strobelight, you know who you are.
# license, MIT, go for it.

me=$(basename "$0")

EXCLUDES="\
    --exclude '*~'
    --exclude '.DS_Store'
"

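# private scratch file for the list of changed paths, removed on exit
CANDIDATES=$(mktemp "${TMPDIR:-/tmp}/candidates.XXXXXX") || exit 1
trap 'rm -f "$CANDIDATES"' EXIT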

usage() {
    cat <<EOF

$me last_diff_dir new_diff_dir [ dir_to_copy ]

where:
    last_diff_dir  is the directory containing the last differential
    new_diff_dir   is the directory you want files saved to
    dir_to_copy    is optional and is the directory to copy from (default .)

cd directory_to_backup
Full backup: $me full_back full_back
Diff backup: $me full_back diff_1
Diff backup: $me full_back diff_2

EOF
    exit 1
}

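# print the absolute path of directory $1; the subshell keeps the
# caller's working directory unchanged and copes with spaces in names
get_dir() {
    ( cd "$1" && pwd )
}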

if [ $# -lt 2 ]; then
    usage
fi

LAST_DIR="$1"
NEW_DIR="$2"
DIR_TO_COPY="${3:-.}"

mkdir -p "$LAST_DIR" || exit 1
mkdir -p "$NEW_DIR" || exit 1

[ -d "$LAST_DIR" ] || usage
[ -d "$NEW_DIR" ] || usage
[ -d "$DIR_TO_COPY" ] || usage

LAST_DIR=`get_dir "$LAST_DIR"`
NEW_DIR=`get_dir "$NEW_DIR"`
DIR_TO_COPY=`get_dir "$DIR_TO_COPY"`

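# get list of what's different from the last backup
# (eval is needed so the quotes stored in $EXCLUDES are honored; the
# single-quoted "$VAR" arguments are expanded by eval itself, which keeps
# paths containing spaces intact)
eval rsync -v --dry-run -axH --delete --update $EXCLUDES '"$DIR_TO_COPY/"' '"$LAST_DIR"' | awk '
    /^(building|sending incremental) file list/ { next }   # header line
    /^$/ { next }
    /^deleting / { next }            # gone from the source: nothing to copy
    /^sent .* bytes/ { next }        # summary lines
    /^total size is / { next }
    {
        sub(/ -> .*$/, "")           # symlinks are listed as "name -> target"
        print                        # keep the whole name, spaces included
    }
    ' | sed 's:/$::' > "$CANDIDATES"
#cat "$CANDIDATES"

# use list to backup
eval rsync --files-from='"$CANDIDATES"' -lptgoDxH --delete $EXCLUDES '"$DIR_TO_COPY/"' '"$NEW_DIR"'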

For example, my current directory has 3 8k files:

$ ls -1sk
total 24
 8 seg1
 8 seg2
 8 seg3

My full backup doesn't exist yet. Let's call that directory full_bak:

$ ls ../full_bak
ls: ../full_bak: No such file or directory

First we need a full backup from which to do differentials. I've copied the script to my $HOME/bin directory as test123.sh. When both args are the same, that's essentially performing a full backup.

$ $HOME/bin/test123.sh ../full_bak ../full_bak

script outputs

.
seg1
seg2
seg3

Now look at ../full_bak

$ ls -1sk ../full_bak
total 24
 8 seg1
 8 seg2
 8 seg3

Make some changes

$ dd if=/dev/zero of=seg2 bs=512 count=11

Confirm there are differences:

$ diff -q . ../full_bak
Files ./seg2 and ../full_bak/seg2 differ

Now create a differential

$ $HOME/bin/test123.sh ../full_bak ../differential1
seg2

Look at the differential, which holds just the file that's different from the last full backup

$ ls -1sk ../differential1/
total 8
 8 seg2

Make another change

$ dd if=/dev/zero of=seg4 bs=512 count=10

Check what's different

$ diff -q . ../full_bak
Files ./seg2 and ../full_bak/seg2 differ
Only in .: seg4

and see we have a new file that's not in our full backup, and a changed file from before.

Do another differential to another directory

$ $HOME/bin/test123.sh ../full_bak ../differential2
.
seg2
seg4

and see that the new differential contains the file from the 1st differential as well as the new file

$ ls -1sk ../differential2
total 16
 8 seg2
 8 seg4
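
Because each differential holds everything that changed since the full backup, a restore needs only two directories: the full backup and the one differential you pick. Here is a minimal sketch of that step, assuming a cp that supports -a (GNU and recent BSD versions do). The function name restore_from_differential and the directory names are made up for illustration and are not part of the script above.

```shell
#!/bin/bash

# restore_from_differential FULL_DIR DIFF_DIR TARGET_DIR
# Hypothetical helper: rebuild a working copy from a full backup plus ONE
# differential. Files deleted since the full backup are not removed, because
# the differential directories do not record deletions.
restore_from_differential() {
    local full="$1" diff="$2" target="$3"

    mkdir -p "$target" || return 1
    # 1. lay down the full backup (-a keeps times, modes and symlinks)
    cp -a "$full"/. "$target"/ || return 1
    # 2. overlay the chosen differential; its files replace the older copies
    cp -a "$diff"/. "$target"/ || return 1
}
```

With the directories from the walkthrough, that would be something like: restore_from_differential ../full_bak ../differential2 ../restored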

Differential Backups

Here's a full-backup wrapper using test123.sh:

#!/bin/bash

FULLDIR=/media/mydisk/home
SRCDIR=/home/user

"$HOME/bin/test123.sh" "$FULLDIR" "$FULLDIR" "$SRCDIR"

Here's a differential script creating subdirectories based on the hour (each BAK_HH directory is reused when the same hour comes around the next day):

#!/bin/bash

FULLDIR=/media/mydisk/fullbackup/home
DIFFDIR=/media/mydisk/differentials/home
SRCDIR=/home/user
DIFFSUB=$(date '+BAK_%H')

"$HOME/bin/test123.sh" "$FULLDIR" "$DIFFDIR/$DIFFSUB" "$SRCDIR"
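
For comparison, rsync can also produce a differential in one pass with its --compare-dest option, which skips any file that is already identical in the reference directory. The following is a minimal sketch of that alternative, not the script above; the function name diff_backup is hypothetical. It may leave empty directories in the differential for subtrees where nothing changed, and like the script above it does not record deletions.

```shell
#!/bin/bash

# diff_backup SRC_DIR FULL_DIR DIFF_DIR
# Hypothetical helper: copy into DIFF_DIR only the files of SRC_DIR that are
# new or differ from the copy in FULL_DIR.
diff_backup() {
    local src full diff

    mkdir -p "$3" || return 1
    # --compare-dest resolves relative paths against the destination, so
    # turn all three directories into absolute paths first
    src=$(cd "$1" && pwd) || return 1
    full=$(cd "$2" && pwd) || return 1
    diff=$(cd "$3" && pwd) || return 1

    rsync -ax --exclude '*~' --exclude '.DS_Store' \
        --compare-dest="$full/" "$src/" "$diff/"
}
```

With the paths from the wrapper above, a call could look like: diff_backup /home/user /media/mydisk/fullbackup/home "/media/mydisk/differentials/home/$(date '+BAK_%H')"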