I'm trying to determine why this find command is not working as expected;
it shouldn't match the file called this_should_not_match
below:
$ > find . -type f -name "*[^ -~]*"
./__º╚t
./this_should_not_match
./__╞_u
./__¡VW
./__▀√Z
./__εè_
./__∙Σ_
./__Σ_9
./__Σhm
./__φY_
My shell is Bash 3.2
Best Answer
Ranges only work reliably and portably in the C locale. In other locales, you get some variation, but generally [x-y] gets you the characters (actually collating elements; it could even match sequences of characters) that sort after x and before y in some sort order which is often obscure and not always the same as sort would use.
In the C locale (see What does “LC_ALL=C” do?), characters are bytes and ranges are based on the code point of the characters (on byte values).
So, run with LC_ALL=C on an ASCII-based system, your find command would list regular files whose names contain bytes other than those between 32 and 126. Most systems are ASCII-based: POSIX doesn't guarantee that the C locale uses the ASCII charset, but in practice, unless you're on some EBCDIC-based special IBM mainframe OS (and then you'd know about it), you'll be using ASCII.
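As a minimal sketch of the C-locale behaviour described above (the scratch directory and the second file name are made up for illustration), the following creates one pure-ASCII name and one name containing non-ASCII bytes, then runs find with LC_ALL=C. It spells the negation as [! -~], which is the POSIX form for shell patterns; many implementations also accept ^.

```shell
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)
touch "$dir/this_should_not_match"
touch "$dir/$(printf 'caf\303\251')"   # "café" in UTF-8: the last two bytes are 0xC3 0xA9

# In the C locale on an ASCII system, the range " -~" covers bytes 0x20 to 0x7E,
# so only names containing a byte outside that range are listed.
result=$(cd "$dir" && LC_ALL=C find . -type f -name '*[! -~]*')
printf '%s\n' "$result"
```

Only the name with the non-ASCII bytes should be printed; this_should_not_match is made of bytes in the 32 to 126 range and is skipped.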
Also note that in a multi-byte character locale (like UTF-8 ones, the norm nowadays), the * may not even match all file names: on some systems, it will fail to match sequences of bytes that don't form valid characters.
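One way to check how a given system behaves is sketched below. It assumes a filesystem that accepts arbitrary bytes in file names (some reject invalid UTF-8 outright) and that a locale named en_US.UTF-8 is installed; both the locale name and the file name are assumptions for illustration.

```shell
# Hypothetical scratch directory; the name contains a lone 0xE9 byte,
# which is not a valid UTF-8 sequence.
dir=$(mktemp -d)
touch "$dir/$(printf 'bad\351name')"

# In the C locale every byte is a character, so a bare "*" matches the name.
c_count=$(cd "$dir" && LC_ALL=C find . -type f -name '*' | wc -l)

# Same pattern in a UTF-8 locale (assumed to exist on this system).
# Depending on the find implementation, this may report 0 or 1.
utf8_count=$(cd "$dir" && LC_ALL=en_US.UTF-8 find . -type f -name '*' | wc -l)

echo "C locale: $c_count, UTF-8 locale: $utf8_count"
```

If the two counts differ, the find on that system is one of those that refuses to match invalid byte sequences in a multi-byte locale, and LC_ALL=C is the safer choice for byte-oriented patterns.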