Rsync keeps copying the same file

Tags: filesystem, rsync, utf-8

I have used rsync for many years between Linux filesystems without any problem, and its behavior has always been very deterministic. Now I am trying to use it on a Mac, and some files (a small proportion of my 600000 files) get copied over and over.

Here is what I tried:

  1. Moving from MacOS rsync 2.6.9 to rsync 3.1.3 (from brew) did not solve the problem.

  2. Giving some large time interval in order to avoid possible clock skew did not resolve the problem.

  3. I saw in some other answers that a possible problem is the utf-8 vs utf-8-mac characters. I understand that it can be a problem but I am ready to accept ascii filenames for the purpose of having linux and mac coexisting. The files in question have ASCII file names.

The weird thing is that this unexpected behavior is completely deterministic. The same file, ListDebug/ForDEBUG, gets copied over and over, yet the file ListDebug/ForDEBUG2, which is right next to it, does not.

Any indication of the origin of this strange behavior would be very welcome.

EDIT: I found some more info when copying only the directory ListDebug.
When I run rsync -vadi -e ssh remote:ListDebug . I get

mathieu@MacBook-Pro: rsync -vadi -e ssh remote:ListDebug .
>f.st......... ListDebug/ForDebug
mathieu@MacBook-Pro: rsync -vadi -e ssh remote:ListDebug .
>f.st......... ListDebug/ForDEBUG
mathieu@MacBook-Pro: rsync -vadi -e ssh remote:ListDebug .
>f.st......... ListDebug/ForDebug

Thus the output oscillates between the two names from one run to the next. That seems really strange and looks like a bug.

EDIT2: The file ListDebug/ForDebug gets copied but ends up named ListDebug/ForDEBUG, and the file ForDEBUG never gets copied.

EDIT3: If I change the contents of ForDebug/ForDEBUG to something standard like TEST1, TEST2, the bug remains. On the other hand, if I rename the files ForDebug/ForDEBUG to file1/file2, the bug disappears.

Best Answer

To summarize the comments: rsync does not handle the transfer of files between a case-sensitive filesystem (typically used on Unix/Linux) and a case-insensitive filesystem (typically used by Windows and macOS).

When two different source paths (e.g. d/x and d/X) are the same after notional conversion to, say, lowercase, rsync does not notice, and may transfer d/x, then overwrite the same destination file with d/X. If the files do not contain the same data, and have the same timestamp, the files will always be updated on future rsync runs.

rsync itself does not seem to offer a solution. One can identify potential problems by going through the source files to list the ambiguities:

find . | tr '[:upper:]' '[:lower:]' | LC_ALL=C sort | LC_ALL=C uniq -d
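The pipeline above prints only the lowercased form of each colliding path. If you also want to see the original names, something like the following sketch might help. The function name case_collisions is made up for illustration, and it assumes that no filename contains a newline and that awk's tolower is good enough for your names (it is for ASCII ones).

```shell
# Read one path per line on stdin and print every group of paths
# that differ only by case, with the original spellings indented
# under the lowercased key. Group order is unspecified.
case_collisions() {
    awk '
        {
            key = tolower($0)
            count[key]++
            names[key] = names[key] "\n  " $0
        }
        END {
            for (k in count)
                if (count[k] > 1)
                    print k ":" names[k]
        }
    '
}
```

It would be run on the case-sensitive side, for example as find . | case_collisions, and prints nothing when there are no collisions.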

Alternatively, a case-sensitive destination can be created.
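Before going to that trouble, it may be worth confirming that the current destination really is case-insensitive. A minimal sketch, assuming you can write to the destination directory (the function name is_case_insensitive is hypothetical): it creates a temporary file and checks whether the same name in uppercase also resolves to it.

```shell
# Exit status 0: directory is on a case-insensitive filesystem.
# Exit status 1: case-sensitive. Exit status 2: probe file could not be created.
is_case_insensitive() {
    dir=${1:-.}
    probe=$(mktemp "$dir/caseprobe.XXXXXX") || return 2
    name=$(basename "$probe")
    # The "caseprobe" prefix is lowercase, so the uppercased name always differs.
    upper=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
    if [ -e "$dir/$upper" ]; then
        result=0
    else
        result=1
    fi
    rm -f "$probe"
    return "$result"
}
```

For example: if is_case_insensitive /path/to/dest; then echo "case-insensitive"; fi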