Shell – Any way to sync directory structure when the files are already on both sides

directory shell synchronization

I have two drives with the same files, but the directory structure is totally different.

Is there any way to 'move' all the files on the destination side so that they match the structure of the source side? With a script perhaps?

For example, drive A has:

/foo/bar/123.txt
/foo/bar/234.txt
/foo/bar/dir/567.txt

Whereas drive B has:

/some/other/path/123.txt
/bar/doo2/wow/234.txt
/bar/doo/567.txt

The files in question are huge (800GB), so I don't want to re-copy them; I just want to sync the structure by creating the necessary directories and moving the files.

I was thinking of a recursive script that would find each source file on the destination, then move it to a matching directory, creating it if necessary. But — that's beyond my abilities!

Another elegant solution was given here:
https://superuser.com/questions/237387/any-way-to-sync-directory-structure-when-the-files-are-already-on-both-sides/238086

Best Answer

I'll go with Gilles and point you to Unison, as suggested by hasen j. Unison was Dropbox 20 years before Dropbox. It is rock-solid code that a lot of people (myself included) use every day -- very worthwhile to learn. Still, join needs all the publicity it can get :)


This is only half an answer, but I have to get back to work :)

Basically, I wanted to demonstrate the little-known join utility, which does just that: it joins two tables on a common field.
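As a quick illustration of what join does, here is a minimal sketch with two made-up tables (the file names left.txt and right.txt and their contents are invented for this example). Both inputs are already sorted on the first column, which join expects.

```shell
# Build two small tables in a scratch directory, each sorted on column 1.
tmp=$(mktemp -d)
printf '%s\n' '1 apple' '2 banana' '3 cherry' > "$tmp/left.txt"
printf '%s\n' '1 red' '3 dark-red' '4 green'  > "$tmp/right.txt"

# Keep only the keys present in both files and glue the remaining columns together.
joined=$(join "$tmp/left.txt" "$tmp/right.txt")
printf '%s\n' "$joined"
# prints:
# 1 apple red
# 3 cherry dark-red

rm -rf "$tmp"
```

Keys 2 and 4 appear in only one table, so they are dropped. The same thing happens below to files whose hash exists on only one side.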

First, set up a test case including file names with spaces:

for d in a b 'c c'; do mkdir -p "old/$d"; echo $RANDOM > "old/${d}/${d}.txt"; done
cp -r old new

(Now rename some directories and/or files in new.)

Now, we want to build a map: hash -> filename for each directory and then use join to match up files with the same hash. To generate the map, put the following in makemap.sh:

# Print 'hash "relative/path"' per file under $1, sorted by hash (join needs sorted input).
find "$1" -type f -exec md5 -r "{}" \; \
  | sed "s/\([a-z0-9]*\) ${1}\/\(.*\)/\1 \"\2\"/" \
  | sort

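makemap.sh prints lines of the form 'hash "filename"' to standard output (md5 -r is the BSD/macOS command; on Linux the closest equivalent is md5sum, whose output separates hash and name with two characters instead of one, so the sed pattern would need adjusting). We then just join on the first column: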

join <(./makemap.sh 'old') <(./makemap.sh 'new') >moves.txt

This generates moves.txt which looks like this:

49787681dd7fcc685372784915855431 "a/a.txt" "bar/a.txt"
bfdaa3e91029d31610739d552ede0c26 "c c/c c.txt" "c c/c c.txt"

The next step would be to actually do the moves, but my attempts got stuck on quoting... mv -i and mkdir -p should come in handy.
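One possible way around the quoting problem is to skip the quoted format altogether and pass tab-separated fields straight into a loop. The following is only a sketch under stated assumptions, not a finished tool: the function names hash_file, make_map and sync_structure are made up for illustration, file names are assumed to contain no tabs or newlines, and no two files on one side are assumed to have identical content (duplicates would make join emit every pairing). It uses md5sum when available and falls back to md5 -q otherwise, and it refuses to overwrite an existing target rather than prompting the way mv -i would.

```shell
#!/bin/sh
# Sketch: rearrange files under NEW so their relative paths match those under OLD.
# Assumptions: no tabs/newlines in file names, no duplicate file contents.

# Print the MD5 of one file using whichever tool is installed.
hash_file() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum < "$1" | cut -d' ' -f1      # GNU coreutils
  else
    md5 -q "$1"                        # BSD/macOS
  fi
}

# Print "hash<TAB>relative/path" for every file under directory $1, sorted by hash.
make_map() {
  ( cd "$1" && find . -type f ) | sed 's|^\./||' | while IFS= read -r rel; do
    printf '%s\t%s\n' "$(hash_file "$1/$rel")" "$rel"
  done | LC_ALL=C sort
}

# sync_structure OLD NEW: move files inside NEW to the paths they have in OLD.
sync_structure() {
  old=$1
  new=$2
  tab=$(printf '\t')
  tmp=$(mktemp -d) || return 1
  make_map "$old" > "$tmp/old.map"
  make_map "$new" > "$tmp/new.map"
  # Each joined line is: hash <TAB> path-in-old <TAB> path-in-new
  LC_ALL=C join -t "$tab" "$tmp/old.map" "$tmp/new.map" |
  while IFS="$tab" read -r hash want have; do
    if [ "$want" = "$have" ]; then
      continue                         # already in the right place
    fi
    if [ -e "$new/$want" ]; then
      echo "skipping, target exists: $want" >&2
      continue
    fi
    mkdir -p "$new/$(dirname "$want")"
    mv "$new/$have" "$new/$want"
  done
  rm -rf "$tmp"
}

# When run as a script with two arguments, treat them as OLD and NEW.
if [ "$#" -eq 2 ]; then
  sync_structure "$1" "$2"
fi
```

Saved under a name of your choosing, it would be run with the source tree first and the tree to rearrange second. Hashing 800GB on both sides takes a while, but nothing is copied: within one filesystem mv only renames. Directories emptied by the moves are left in place, and it is worth trying this on a small copy, like the old/new test case above, before pointing it at real data.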
