Best practice to continue mv

data-recovery mv

I used the terminal to move files from one drive to another.

sudo mv -vi /location/to/drive1/ /location/to/drive2/

However, it suddenly stopped a few hours in, without an error, right after creating a directory.

My own solution is usually a mix of hashing and comparing, which is mostly a time-consuming mess, because I now have to recover from an intermediate copy without really knowing which files are missing. It is written as a very long one-liner for zsh (note that this script doesn't work in bash as written):

source_directory="/path/to/source_directory/";
target_directory="/path/to/target_directory/";
while read hash_and_file; do {
    echo "${hash_and_file}" | read hash file;
    echo "${file}" | sed "s/^/${source_directory}/g" | read copy_from;
    echo "${copy_from}" | sed "s/${source_directory}/${target_directory}/g" | read copy_to;
    mv -v "${copy_from}" "${copy_to}" | tee -a log;
    rm -v "${copy_from}" | tee -a log; };
done <<<$(
    comm -23 <( find ${source_directory} -type f -exec sha256sum "{}" \; |
                sed "s: ${source_directory}: :g" | sort;
           ) <( find ${target_directory} -type f -exec sha256sum "{}" \; |
                sed "s: ${target_directory}: :g" | sort; ) )

This is error-prone if the name of the target directory or the source directory also appears inside a file path, and it deletes files that have not been moved because they were marked as duplicates. Also, it does not remove the source directory in the end.

Is there a best practice for recovering from an interrupted mv?

Best Answer

Forget about trying to reinvent rsync, and use rsync.

sudo rsync -av /location/to/drive1/ /location/to/drive2/

Make sure you use a trailing slash on the source; otherwise rsync would copy to /location/to/drive2/drive1.

Double-check that the command succeeded, then run rm -rf /location/to/drive1/.
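The double-check can be scripted. The following is a minimal sketch, assuming rsync is installed and that a checksum-based dry run is an acceptable definition of "succeeded". safe_move is a hypothetical helper name, not part of rsync, and the paths passed to it are placeholders.

```shell
#!/bin/sh
# Sketch: copy, verify, and delete the source only if verification passes.
# safe_move is a hypothetical helper; $1 is the source, $2 the destination.
safe_move() {
    src=$1
    dst=$2

    # Copy everything. The trailing slashes make rsync copy the contents
    # of src into dst instead of creating dst/<name of src>.
    rsync -a "$src/" "$dst/" || return 1

    # Dry run comparing checksums: print anything that would still be
    # transferred or updated. Nothing is modified in this step.
    pending=$(rsync -a --checksum --dry-run --itemize-changes "$src/" "$dst/") || return 1

    if [ -n "$pending" ]; then
        printf 'Differences remain, not deleting %s:\n%s\n' "$src" "$pending" >&2
        return 1
    fi

    # Only reached when the dry run reported no differences.
    rm -rf -- "$src"
}
```

It would be called as safe_move /location/to/drive1 /location/to/drive2, run as root if the files require it. The checksum pass reads every file on both drives again, so it is slow on large trees; dropping --checksum falls back to comparing size and modification time.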

The command above will overwrite any preexisting file on drive2. If you want to prompt the user to skip files that already exist on drive2, as with mv -i, it's more complicated, because you now need to distinguish files that have already been copied from files that haven't. You can pass the --ignore-existing option to rsync to skip files that already exist on the destination regardless of their content. Note that if the original mv was interrupted in the middle of creating a file, this file will remain in its half-copied state (whereas a bare rsync -a would properly finish copying it).
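As a sketch of the two behaviors, assuming rsync is installed: resume_skip_existing and list_would_overwrite are hypothetical helper names. The first wraps --ignore-existing. The second uses a dry run with --existing, which restricts rsync to files already present on the destination, to preview which destination files a bare rsync -a would replace, half-copied ones included.

```shell
#!/bin/sh
# Hypothetical helpers; both take a source and a destination directory.

# Copy only files that are missing on the destination. Existing
# destination files are left untouched regardless of their content.
resume_skip_existing() {
    rsync -a --ignore-existing "$1/" "$2/"
}

# Preview only: list destination files that a plain `rsync -a` would
# overwrite. --existing skips files absent from the destination, and
# --dry-run ensures nothing is changed.
list_would_overwrite() {
    rsync -a --dry-run --itemize-changes --existing "$1/" "$2/"
}
```

Running list_would_overwrite first gives a list to review by hand before choosing between the overwriting and the skipping variant. Directories whose timestamps differ may also appear in that list.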

If you want to reproduce the exact behavior of mv -i, including the prompting, it could be done, but it's a lot more complicated.

Note that your one-giant-liner is very fragile. If there are file names containing backslashes or newlines, they may not be copied properly or they may even trick your script into removing arbitrary files. So do not use the code in the question unless you're sure that you can trust the file names not to contain backslashes or newlines.
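To illustrate the backslash problem: a plain read treats a backslash as an escape character and drops it, so the path that comes out differs from the path that went in. A minimal sketch in POSIX sh; unsafe_read and safe_read are hypothetical names used only for this demonstration.

```shell
#!/bin/sh
# Without -r, read consumes backslashes, so the name is silently altered.
unsafe_read() {
    printf '%s\n' "$1" | { read line; printf '%s\n' "$line"; }
}

# IFS= keeps leading and trailing whitespace; -r keeps backslashes.
safe_read() {
    printf '%s\n' "$1" | { IFS= read -r line; printf '%s\n' "$line"; }
}

unsafe_read 'dir\name'   # prints dirname
safe_read 'dir\name'     # prints dir\name
```

Neither variant survives a newline inside a file name, because the input is still line-based. Letting find run the command itself through -exec avoids parsing file names at all.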

For future reference, I recommend never using mv for large cross-drive moves, precisely because it's hard to control what happens if it gets interrupted. Use rsync to do the copying, and then remove the original.
