In all shells I am aware of, rm [A-Z]* removes all files that start with an uppercase letter, but with bash this removes all files that start with a letter.
As this problem exists on Linux and Solaris with bash-3 and bash-4, it cannot be caused by a buggy pattern matcher in libc or by a misconfigured locale definition.
Is this strange and risky behavior intended, or is it just a bug that has gone unfixed for many years?
Best Answer
LC_COLLATE is a variable which determines the collation order used when sorting the results of pathname expansion, and determines the behavior of range expressions, equivalence classes, and collating sequences within pathname expansion and pattern matching.
Consider the following:
Notice that when the command echo [a-z] is called, the expected output would be all files with lowercase characters. Likewise, with echo [A-Z], files with uppercase characters would be expected.
Standard collations with locales such as en_US interleave lowercase and uppercase letters (aAbBcC...xXyYzZ), which means:
Between a and z (in [a-z]) are ALL uppercase letters, except for Z.
Between A and Z (in [A-Z]) are ALL lowercase letters, except for a.
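One way to make a locale's collation order visible is to sort a few single letters under it. The following is only a minimal sketch: show_order is a hypothetical helper name, and the second call assumes that the en_US.UTF-8 locale is installed (if it is not, sort will most likely fall back to the C order and a warning may be printed).

```shell
#!/usr/bin/env bash
# show_order is a hypothetical helper, not part of bash or coreutils.
# It sorts a handful of letters under the locale given as $1 and
# prints them on one line, so the collation order becomes visible.
show_order() {
    printf '%s\n' Z a B z A b | LC_ALL="$1" sort | tr -d '\n'
    echo
}

show_order C             # byte order: uppercase sorts before lowercase
show_order en_US.UTF-8   # assumption: this locale is installed
```

In the C locale all uppercase letters come first, which is why a range such as [A-Z] contains only uppercase letters there. Under a locale that interleaves the two cases, the same range spans letters of both cases.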
If you change the LC_COLLATE variable to C, the ranges match as expected.
So, it's not a bug, it's a collation issue.
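The effect of LC_COLLATE=C can be tried in a scratch directory. This is a sketch under stated assumptions: list_upper is a hypothetical helper and the file names are made up for illustration. LC_ALL is unset first because, when set, it takes precedence over LC_COLLATE.

```shell
#!/usr/bin/env bash
# list_upper is a hypothetical helper: it creates a scratch directory
# with mixed-case file names and expands [A-Z]* with LC_COLLATE=C.
# The body runs in a subshell, so the caller's locale and working
# directory are left untouched.
list_upper() (
    dir=$(mktemp -d) || exit 1
    trap 'rm -rf "$dir"' EXIT      # remove the scratch directory on exit
    cd "$dir" || exit 1
    touch Alpha beta Gamma zeta    # illustrative file names
    unset LC_ALL                   # LC_ALL would override LC_COLLATE
    LC_COLLATE=C
    echo [A-Z]*
)

list_upper   # prints: Alpha Gamma
```

Because the assignment happens inside a subshell, nothing needs to be restored afterwards. For a one-off command in an interactive shell, the setting has to be in effect before the pattern is expanded.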
Instead of range expressions you can use POSIX-defined character classes, such as upper or lower. They also work with different LC_COLLATE configurations and even with accented characters.
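Inside a glob, the character classes are written as [[:upper:]] and [[:lower:]]. A minimal sketch, again with a hypothetical helper (list_by_class) and made-up file names, restricted to ASCII names so that it does not depend on the locale in use:

```shell
#!/usr/bin/env bash
# list_by_class is a hypothetical helper: inside a scratch directory it
# expands a glob that uses the POSIX character class named in $1.
list_by_class() (
    dir=$(mktemp -d) || exit 1
    trap 'rm -rf "$dir"' EXIT      # remove the scratch directory on exit
    cd "$dir" || exit 1
    touch Alpha beta Gamma zeta    # illustrative file names
    case $1 in
        upper) echo [[:upper:]]* ;;
        lower) echo [[:lower:]]* ;;
        *)     echo "usage: list_by_class upper|lower" >&2; exit 2 ;;
    esac
)

list_by_class upper   # prints: Alpha Gamma
list_by_class lower   # prints: beta zeta
```

With this, the risky command from the question would be written as rm [[:upper:]]* instead of rm [A-Z]*.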