One-way-sync a directory, but leave deleted files deleted on the destination

rsync synchronization

I want to sync a directory between two systems. To make it more interesting the syncing must only be done in one direction, i.e.:

  • if a file is deleted in the source directory, it must also be deleted in the destination, if it was previously transferred
  • deleted files in the destination directory must not be deleted in the source
  • partially transferred files (e.g. because of network problems) must be finished on the next sync
  • new files in the source directory must be transferred to the destination
  • deleted files in the destination directory must not be re-transferred

That means the source system has basically a master role, except that deleted files in the destination will not be forced back.

Both Linux systems have rsync/ssh/scp available.

New files in the source directory are created in such a way that one can use their mtime to detect them, e.g.:

if mtime(file) > date-of-last-sync then: it is a new file that needs to be transferred

Also, existing files are not changed in the source directory, i.e. the sync does not need to check for differences in files that have already been (completely) transferred.
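
The mtime test described above could be sketched in shell roughly as follows. This is only an illustration: the function name list_new_files, the stamp file, and the demo directory are assumptions made for the example, not part of the original setup. It relies on find's -newer test, which compares each file's mtime with that of a reference file.

```bash
#!/usr/bin/env bash
# Sketch only: list_new_files, the stamp file and the demo paths are hypothetical.

# Print every regular file below src_dir whose mtime is later than
# the mtime of the stamp file (the "date of last sync").
list_new_files() {
  local src_dir=$1 stamp=$2
  (cd "$src_dir" && find . -type f -newer "$stamp" -print)
}

# Small self-contained demo with fixed timestamps.
demo=$(mktemp -d)
mkdir "$demo/src"
touch -t 202001010000 "$demo/src/old.txt"   # created before the last sync
touch -t 202001020000 "$demo/stamp"         # pretend the last sync ran here
touch -t 202001030000 "$demo/src/new.txt"   # created after the last sync

list_new_files "$demo/src" "$demo/stamp"    # prints ./new.txt
```

After a successful sync the stamp file would be updated with touch. To avoid missing files created while a sync is running, one option is to create the new stamp before the transfer starts and only move it into place once the transfer has succeeded.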

Best Answer

If you're not going to use the remote file system as the source of truth for what has been transferred, then you need to track the successfully transferred files externally and exclude them from future transfers.

rsync can include and exclude files based on patterns read from a file, so you can include a specific list of files in one transfer and then exclude that same list from future transfers.

#!/usr/bin/env bash

set -e

track_dir=~/.track_xfer
inc_file="$track_dir/include_files"
exc_file="$track_dir/exclude_files"
xfer_dir=~/testrsync
xfer_dest=~/testrsync_dest

mkdir -p "$track_dir"
touch "$exc_file"
cd "$xfer_dir"

# find files and create rsync filter list
find . -type f -print0 | perl -e '
  $/="\0"; 
  while (<>){ 
   chomp; 
   $_ =~ s!^\.!!;    # remove leading .
   $f = quotemeta;   # quote special chars
   $f =~ s!\\/!/!g;  # fix quoted paths `/`
   print $f."\n"; 
  }' > "$inc_file"

# Run the rsync
rsync -va --delete --exclude-from "$exc_file" --include-from "$inc_file" "$xfer_dir/" "$xfer_dest"

# Add the included/transferred files to the exclusion list
sort -u "$inc_file" "$exc_file" > "$exc_file.tmp"
mv "$exc_file.tmp" "$exc_file"

rsync's include/exclude patterns are shell-style wildcards rather than regular expressions, so you might need some more rsync-specific quoting; the Perl quotemeta function and the replacements were simply the first easy solution that came to mind.

The main problem will be dealing with any special characters in file names. If you want to deal with newlines, tabs, and other strange things in the names, then you will have to put a bit more work into the Perl (or whatever) that parses and generates the inclusion pattern list. If you can restrict the names of your transfer files to a simple character set, then you don't need to worry about this step as much. The Perl is a halfway solution that should get you past the most common special characters.
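
As a rough sketch of what more careful quoting might look like, the helper below escapes only rsync's wildcard characters. It assumes, based on the pattern rules described in the rsync man page, that a backslash acts as an escape only when the pattern contains one of `*`, `?` or `[`, and is matched literally otherwise; please verify that against your rsync version. The name rsync_escape is made up for this example, and names containing newlines are still not handled, since the pattern file is line-based.

```bash
#!/usr/bin/env bash
# Sketch only: rsync_escape is a hypothetical helper, not an rsync feature.

# Print a file name as a literal rsync pattern.
# Assumption: backslash escapes are only honoured in patterns that
# contain a wildcard character (*, ? or [), so plain names are left alone.
rsync_escape() {
  local name=$1
  case $name in
    *'*'*|*'?'*|*'['*)
      # Wildcard present: escape backslashes and wildcard characters.
      printf '%s\n' "$name" | sed 's/[\\*?[]/\\&/g'
      ;;
    *)
      # No wildcard: the name is matched literally as-is.
      printf '%s\n' "$name"
      ;;
  esac
}

# Demo on a few awkward names.
for n in 'plain.txt' 'a[1].txt' 'what?.txt'; do
  rsync_escape "$n"
done
```

The path would still need its leading `/` so that the pattern is anchored at the root of the transfer, as the Perl above arranges by stripping the leading `.`.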

The reason for using the include list rather than letting rsync pull the whole directory itself is so that you have a defined/complete list of files for the subsequent exclude list. You could probably achieve the same result by parsing the rsync output or a --log-file=FILE for the files that were transferred, but that looked a little harder.
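
For comparison, the output-parsing alternative might look something like the sketch below. It is untested against a real transfer and makes a few assumptions: that rsync is run with --out-format='%i %n', so each line starts with the itemized-change string followed by the file name, and that regular files which were actually transferred show up with a leading `<f` or `>f`. The function names transferred_patterns and sync_and_record are invented for the example.

```bash
#!/usr/bin/env bash
# Sketch only: transferred_patterns and sync_and_record are hypothetical helpers.

# Read "%i %n" lines on stdin and print an anchored pattern for every
# regular file that was sent or received (itemize codes "<f" / ">f").
# Directory lines ("cd...") and "*deleting" lines are dropped.
transferred_patterns() {
  sed -n 's|^[<>]f[^ ]* |/|p'
}

# Run the transfer and append the transferred files to the exclude list,
# but only if rsync itself exited successfully.
sync_and_record() {
  local src=$1 dest=$2 exc_file=$3 log
  log=$(mktemp)
  if rsync -a --delete --exclude-from "$exc_file" \
       --out-format='%i %n' "$src/" "$dest" > "$log"; then
    transferred_patterns < "$log" >> "$exc_file"
  fi
  rm -f "$log"
}
```

It would be called as, for example, sync_and_record "$xfer_dir" "$xfer_dest" "$exc_file". The same caveats about special characters in file names apply to the names taken from the rsync output.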