Find and rsync both choke on oddly named file

filenames find ls rsync

This isn't an important issue to me, but I thought rsync and find were fairly robust, so I was surprised when rsync choked on a file, and then find did too. ls -l shows the file has 6093 bytes (and it's the only file in that directory that does, so I did this after cd'ing to that directory):

# find . -size 6093c
./????????????????????????:??????????????????????????????????????????
find: './\353\266\204\353\245\230:\353\257\270\352\265\255\354\235\230_\355\205\224\353\240\210\353\271\204\354\240\204_\352\262\214\354\236\204_\354\207\274': No such file or directory

Any idea what this means? Bizarrely,

# find . -size 6093c | xargs less

works fine. Here's what ls sees:

# ls -lat | fgrep "6093 "
ls: cannot access ''$'\353\266\204\353\245\230'':'$'\353\257\270\352\265\255\354\235\230''_'$'\355\205\224\353\240\210\353\271\204\354\240\204''_'$'\352\262\214\354\236\204''_'$'\354\207\274': No such file or directory
-rw-rw-r--. 1 nobody nobody   6093 Oct 23  2013 หมวà¸à¸«à¸¡à¸¹à¹:à¹à¸à¸¡à¹à¸à¸§à¹à¸­à¹à¸¡à¸£à¸´à¸à¸²

It gets only slightly better if I pipe the results to less:

# ls -lat | fgrep "6093 " | less

ls: cannot access ''$'\353\266\204\353\245\230'':'$'\353\257\270\352\265\255\354\235\230''_'$'\355\205\224\353\240\210\353\271\204\354\240\204''_'$'\352\262\214\354\236\204''_'$'\354\207\274': No such file or directory
-rw-rw-r--. 1 nobody nobody   6093 Oct 23  2013 <E0><B8><AB><E0><B8><A1><E0><B8><A7><E0><B8><94><E0><B8><AB><E0><B8><A1><E0><B8><B9><E0><B9><88>:<E0><B9><80>
<E0><B8><81><E0><B8><A1><E0><B9><82><E0><B8><8A><E0><B8><A7><E0><B9><8C><E0><B8><AD><E0><B9><80><E0><B8><A1><E0><B8><A3><E0><B8><B4><E0><B8><81><E0><B8><B2>

The same directory has a file even ls can't handle, but I can list it since it kind of sort of shows up as the oldest entry:

# ls -lat | tail -1 | less
ls: cannot access ''$'\353\266\204\353\245\230'':'$'\353\257\270\352\265\255\354\235\230''_'$'\355\205\224\353\240\210\353\271\204\354\240\204''_'$'\352\262\214\354\236\204''_'$'\354\207\274': No such file or directory
-?????????? ? ?      ?           ?            ? <EB><B6><84><EB><A5><98>:<EB>
<AF><B8><EA><B5><AD><EC><9D><98>_<ED><85><94><EB><A0><88><EB><B9><84><EC><A0>
<84>_<EA><B2><8C><EC><9E><84>_<EC><87><BC>

Not super important, but sort of a curiosity.

EDIT: since this question seems to have drawn a lot of attention quickly, I did a little "research" (which may or may not be entirely accurate). I was not quite able to replicate the issue, but:

Best Answer

The oddly named file might be a red herring. Your tools are performing several tricks to make you think something is broken.

The filename is in UTF-8, so you should export LANG=en_US.UTF-8 to allow your commands to use the filename without friction. Run the locale command with no arguments to verify the current environment variables.
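As a rough sketch of that check, assuming a glibc-style system where the locale command is available and en_US.UTF-8 is actually installed (it may not be on a minimal install):

```shell
# Show the locale settings the current shell is using (skipped if the
# locale command is not installed).
command -v locale >/dev/null 2>&1 && locale

# List the installed UTF-8 locales; en_US.UTF-8 is only usable if it
# shows up here.
utf8_locales=$(locale -a 2>/dev/null | grep -i -E 'utf-?8' || true)
printf '%s\n' "$utf8_locales"

# Switch this shell session to a UTF-8 locale.
export LANG=en_US.UTF-8
# LC_ALL overrides LANG, so clear it in case it was forcing the "C" locale.
unset LC_ALL
printf 'LANG is now %s\n' "$LANG"
```

If the list printed by locale -a is empty, the export will not have the intended effect until a UTF-8 locale is generated on the system.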

Or, if you insist on using "C" locale, use ls -b to have ls print escape sequences instead of question marks. Then you can use $'\353\266\204\…' as an argument in bash.
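To illustrate, here is a small sketch that assumes bash and GNU ls. It works in a throwaway directory and uses only the first six bytes of the name from the question, so the file it creates is a stand-in, not your actual file:

```shell
# Requires bash: $'...' is ANSI-C quoting, where each \NNN is one byte in octal.
tmpdir=$(mktemp -d)

# Stand-in file whose name is the first two characters (six bytes) of the
# name shown in the question.
name=$'\353\266\204\353\245\230'
printf 'test\n' > "$tmpdir/$name"

# In the "C" locale, ls -b prints the non-graphic bytes as backslash escapes.
escaped=$(LC_ALL=C ls -b "$tmpdir")
printf '%s\n' "$escaped"

# The escaped form can be pasted back inside $'...' to address the file.
LC_ALL=C ls -l "$tmpdir"/$'\353\266\204\353\245\230'

rm -r "$tmpdir"
```

The point is that whatever ls -b prints can be copied straight into $'...' without needing a terminal that can render the characters.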

The find command cowardly refuses to write non-textual characters to a tty. In other words, find and find | cat behave differently: the latter writes the names unquoted, which is why find | xargs does work. A more robust way to write that is find -print0 | xargs -0, which separates names with NUL bytes so that whitespace in a filename is not treated by xargs as a delimiter.
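A quick way to see the difference, sketched with two made-up filenames in a temporary directory (find -print0 and xargs -0 are assumed to be the GNU or BSD versions):

```shell
tmpdir=$(mktemp -d)

# Two hypothetical files: one with a space and one with an embedded newline.
printf 'a\n' > "$tmpdir/has space.txt"
printf 'b\n' > "$tmpdir/$(printf 'has\nnewline.txt')"

# Plain pipe: xargs splits on whitespace, so it sees more arguments than files.
naive=$(find "$tmpdir" -type f | xargs sh -c 'echo "$#"' sh)

# NUL-separated: each filename arrives as exactly one argument.
safe=$(find "$tmpdir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

printf 'arguments seen without -print0: %s\n' "$naive"
printf 'arguments seen with -print0:    %s\n' "$safe"

rm -r "$tmpdir"
```

With -print0 and -0 the argument count matches the number of files, which is what you want before handing the names to less, rm, or anything else.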

This doesn't explain the "No such file or directory" error, and your filesystem may indeed contain an error, but it should allow you to state your intent to the command line more precisely.

At first I did not think it was relevant, but I am in the habit of prefixing strange file names with ./ so that commands treat them as plain paths instead of interpreting them as options or, in rsync's case, as a host:path pair. The sidebar showed a related question "rsync: colon in file names" which might be the root cause of your rsync error.
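Here is a minimal sketch of the ./ habit, using a made-up filename (-x:odd) that has both a leading dash and a colon. The rsync step only runs if rsync is installed:

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir"

# Hypothetical awkward name: leading dash plus a colon.
printf 'hello\n' > ./'-x:odd'

# Without the ./ prefix, cat would try to parse "-x:odd" as options.
content=$(cat ./'-x:odd')
printf '%s\n' "$content"

# rsync reads "something:path" as host:path; because ./ puts a slash before
# the colon, the argument is treated as a local file.
if command -v rsync >/dev/null 2>&1; then
    mkdir dest
    rsync -a ./'-x:odd' dest/
    ls -b dest
fi

cd - >/dev/null
rm -r "$tmpdir"
```

The same prefix works for any command, which is why it is a cheap habit to keep even when it does not look necessary.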
