Rsync – How to Compare Two Folders and Copy Differences

Tags: diff, file-copy, files, rsync

You've got three folders:

  • folder current, which contains your current files
  • folder old, which contains an older version of the same files
  • folder difference, which is just an empty folder

How do you compare old with current and copy the files which are different (or entirely new) in current to difference?


I have searched all around and it seems like a simple thing to tackle, but I can't get it to work in my particular example. Most sources suggested using rsync, so I ended up with the following command (in my test the current folder is named new):

rsync -ac --compare-dest=../old/ new/ difference/

What this does, however, is copy all the files from new to difference, even those that are identical to the ones in old.

In case it helps (maybe the command is fine and the fault lies elsewhere), this is how I tested this:

  1. I made the three folders.
  2. I made several text files with different contents in old.
  3. I copied the files from old to new.
  4. I changed the contents of some of the files in new and added a few additional files.
  5. I ran the above command and checked the results in difference.
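
The test steps above can be reproduced with a short script. This is only a sketch under assumptions: the file names (a.txt, b.txt, c.txt, d.txt) and the helper names build_test_tree and expected_difference are hypothetical and not part of the original question. It builds the three folders in a temporary directory and reports which files a correct solution should place in difference.

```python
import shutil
import tempfile
from pathlib import Path


def build_test_tree(root):
    """Create old/, new/ and difference/ under root, following the steps above."""
    root = Path(root)
    old, new, difference = root / "old", root / "new", root / "difference"
    old.mkdir()
    difference.mkdir()

    # Step 2: several text files with different contents in old.
    for name in ("a.txt", "b.txt", "c.txt"):
        (old / name).write_text(f"original contents of {name}\n")

    # Step 3: copy the files from old to new.
    # copytree copies with shutil.copy2, so modification times are preserved.
    shutil.copytree(old, new)

    # Step 4: change one file and add one (hypothetical choices).
    (new / "b.txt").write_text("changed contents\n")
    (new / "d.txt").write_text("brand new file\n")
    return old, new, difference


def expected_difference(old, new):
    """Names in new that are missing from old or whose bytes differ."""
    return sorted(
        p.name
        for p in new.iterdir()
        if not (old / p.name).exists()
        or (old / p.name).read_bytes() != p.read_bytes()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        old, new, difference = build_test_tree(tmp)
        print(expected_difference(old, new))  # ['b.txt', 'd.txt']
```

Any candidate command can then be run against this tree and its output in difference compared with that list.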

I have been looking for a solution for the past couple of days and I'd really appreciate some help. It doesn't necessarily have to be using rsync, but I'd like to know what I'm doing wrong if possible.

Best Answer

I am not sure whether you can do it with existing Linux commands such as rsync or diff. In my case I wrote my own script in Python, since Python has the "filecmp" module for file comparison. I have posted the whole script and its usage on my personal site: http://linuxfreelancer.com/
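
To show what the filecmp module offers, here is a minimal sketch of filecmp.dircmp on a throwaway tree. The file names and the summarize helper are hypothetical. It reads the three attributes the full script depends on: left_only, diff_files and common_dirs. Note that dircmp looks at one directory level at a time, so subdirectories have to be visited explicitly.

```python
import filecmp
import tempfile
from pathlib import Path


def summarize(new_dir, old_dir):
    """Return (only_in_new, differing, common_subdirs) for one directory level."""
    cmp = filecmp.dircmp(str(new_dir), str(old_dir))
    return sorted(cmp.left_only), sorted(cmp.diff_files), sorted(cmp.common_dirs)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        new, old = Path(tmp, "new"), Path(tmp, "old")
        (new / "sub").mkdir(parents=True)
        (old / "sub").mkdir(parents=True)

        (old / "same.txt").write_text("unchanged\n")
        (new / "same.txt").write_text("unchanged\n")
        (old / "edited.txt").write_text("v1\n")
        (new / "edited.txt").write_text("version two\n")  # different size
        (new / "added.txt").write_text("new\n")

        print(summarize(new, old))
        # (['added.txt'], ['edited.txt'], ['sub'])
```

By default dircmp compares files shallowly: two files with the same type, size and modification time are treated as equal without their contents being read.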

The script's usage is simple: give it the absolute paths of the new directory, the old directory and the difference directory, in that order.

#!/usr/bin/env python3
"""Copy files that are new or changed in currentdir (relative to olddir) into difference."""

import filecmp
import os
import shutil
import sys


def compareme(dir1, dir2, found=None):
    """Return absolute paths under dir1 that are missing from dir2 or differ from it."""
    if found is None:
        found = []
    dircomp = filecmp.dircmp(dir1, dir2)
    # left_only: entries that exist only in dir1.
    # diff_files: files present on both sides that compare as different.
    # dircmp compares shallowly: same type, size and mtime counts as equal.
    for name in dircomp.left_only + dircomp.diff_files:
        found.append(os.path.abspath(os.path.join(dir1, name)))
    # Recurse into subdirectories that exist on both sides.
    for item in dircomp.common_dirs:
        compareme(os.path.join(dir1, item), os.path.join(dir2, item), found)
    # Always return the list, even when there are no common subdirectories.
    return found


def main():
    if len(sys.argv) < 4:
        print("Usage:", sys.argv[0], "currentdir olddir difference")
        sys.exit(1)

    dir1, dir2, dir3 = (os.path.abspath(p) for p in sys.argv[1:4])

    for src in compareme(dir1, dir2):
        # Rebuild the same relative path under the difference directory.
        # relpath avoids treating dir1 as a regular expression.
        dst = os.path.join(dir3, os.path.relpath(src, dir1))
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if os.path.isdir(src):
            # A directory that exists only in currentdir: copy it whole.
            shutil.copytree(src, dst, dirs_exist_ok=True)
        elif os.path.isfile(src):
            shutil.copyfile(src, dst)


if __name__ == "__main__":
    main()
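
Because filecmp.dircmp may skip reading file contents when type, size and modification time all match, a stricter check compares the bytes themselves. The following is a sketch of that variation, not the script above: it walks the new directory with os.walk and compares SHA-256 digests. The function names file_digest and copy_differences are hypothetical.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_differences(new_dir, old_dir, diff_dir):
    """Copy files that are new or changed in new_dir into diff_dir.

    Returns the sorted relative paths of the files that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(new_dir):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, new_dir)
            old = os.path.join(old_dir, rel)
            # Skip files whose contents are identical in old_dir.
            if os.path.isfile(old) and file_digest(old) == file_digest(src):
                continue
            dst = os.path.join(diff_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also keeps timestamps
            copied.append(rel)
    return sorted(copied)
```

Calling copy_differences("new", "old", "difference") returns the relative paths it copied. The trade-off is that every file present on both sides is read in full, which is slower on large trees than a metadata-based comparison.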