Rsync: compare directories with symlinks

rsync symlink

There are various similar questions here, but I don't think any of them quite answer this.

I have a local directory (www) which has to be mirrored exactly on a remote server, and I need to confirm that they're exact copies.

www contains symlinks to other files inside www (the links point to both regular files and directories). This command compares the two versions of www, excluding local svn directories as a bonus, and prints only the files that differ:

[local]$ rsync -rvnc --exclude=.svn/ --delete www/ remote:/var/www/

But it doesn't follow symlinks, and just reports that they're not regular files. So, I need to dereference the symlinks, and compare their targets. Neither rsync -rvncL nor rsync -rvncK does this – -L prints out lots of files that don't differ (that I can see, anyway), and -K is doing nothing. Any ideas?

Best Answer

Getting ordinary commands to handle symlinks correctly is tricky, but find handles them well. So the key is to use find, together with either a fast CRC or a cryptographic hash function, depending on your needs.

So something like this should work (adjust as needed):

find -L www -type f -exec cksum {} \; | LC_ALL=C sort -k3 | cut -d ' ' -f1-2 | md5sum

If you want cryptographic strength for your checksums:

find -L www -type f -exec sha256sum {} \; | LC_ALL=C sort -k2 | cut -d ' ' -f1 | sha256sum

Run the same pipeline on both machines and compare the final hash. The -type f test is required because sha256sum only works on files and errors out on directories. The sort puts the lines in path order, because find lists files in whatever order the filesystem returns them, and that order can differ between machines. The cut then passes only the checksum (and, for cksum, the size) to the final hash, which avoids false positives when only the path names differ.
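The question asks for the files that differ, not a single yes/no hash, so a per-file manifest may be more useful. Here is a minimal sketch, assuming GNU find and sha256sum are available on both machines. The function name manifest is made up for illustration, and the .svn exclusion mirrors the rsync command in the question.

```shell
# manifest DIR: print "sha256  ./relative/path" for every regular file
# under DIR, following symlinks (-L), skipping .svn directories, and
# sorted by path so the output is comparable across machines.
# (Hypothetical helper, not part of rsync or find.)
manifest() {
    (
        cd "$1" || exit 1
        find -L . -name .svn -prune -o -type f -exec sha256sum {} + |
            LC_ALL=C sort -k2
    )
}

# Possible usage (bash, host and paths taken from the question):
#   diff <(manifest www) \
#        <(ssh remote 'cd /var/www && find -L . -name .svn -prune -o -type f -exec sha256sum {} + | LC_ALL=C sort -k2')
```

Because the paths are relative to the directory being scanned, they can stay in the output, and diff then shows which files differ or exist on one side only. Broken symlinks are skipped silently, since -type f does not match them.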

Note: This fails when a symlink is absolute rather than relative and the two systems do not have exactly the same paths, which can happen if a symlink points outside the directory you are running find on.
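To see in advance whether that caveat applies, you could list the symlinks whose stored target is an absolute path. This is a rough sketch, assuming a find that supports the -lname test (GNU find does). The function name list_absolute_links is made up for illustration.

```shell
# list_absolute_links DIR: print every symlink under DIR whose stored
# target starts with "/", i.e. an absolute rather than relative link.
# Deliberately run without -L so find examines the links themselves.
# (Hypothetical helper; -lname is assumed to be supported by your find.)
list_absolute_links() {
    find "$1" -type l -lname '/*'
}

# Possible usage:
#   list_absolute_links www
```

It does not catch relative links that climb out of the tree with ../, so treat an empty result as a hint, not as proof.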
