The difference between [[ … ]] and [ … ] is mostly covered in Why does parameter expansion with spaces without quotes work inside double brackets "[[" but not inside single brackets "["?.
Crucially, [[ … ]] is special syntax, whereas [ is a funny-looking name for a command. [[ … ]] has special syntax rules for what's inside, [ … ] doesn't.
With the added wrinkle of a wildcard, here's how [[ $a == z* ]] is evaluated:
1. Parse the command: this is the [[ … ]] conditional construct around the conditional expression $a == z*.
2. Parse the conditional expression: this is the == binary operator, with the operands $a and z*.
3. Expand the first operand into the value of the variable a.
4. Evaluate the == operator: test if the value of the variable a matches the pattern z*.
5. Evaluate the conditional expression: its result is the result of the conditional operator.
6. The command is now evaluated, its status is 0 if the conditional expression was true and 1 if it was false.
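The evaluation of [[ $a == z* ]] described above can be checked directly. This is a minimal sketch, assuming bash; the value given to a is an arbitrary illustration containing both a space and a wildcard character.

```shell
# Assumes bash; the value of a is made up for illustration.
a='zoo b*'

# $a is expanded but neither split nor globbed, so == sees a single
# left-hand operand, which is then matched against the pattern z*.
if [[ $a == z* ]]; then
  result=match
else
  result='no match'
fi
echo "$result"
```

No error is reported even though the value contains a space, because no word splitting takes place on the left-hand side.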
Here's how [ $a == z* ] is evaluated:
1. Parse the command: this is the [ command with the arguments formed by evaluating the words $a, ==, z*, ].
2. Expand $a into the value of the variable a.
3. Perform word splitting and filename generation on the parameters of the command.
   - For example, if the value of a is the 6-character string foo b* (obtained by e.g. a='foo b*') and the list of files in the current directory is (bar, baz, qux, zim, zum), then the result of the expansion is the following list of words: [, foo, bar, baz, ==, zim, zum, ].
4. Run the command [ with the parameters obtained in the previous step.
   - With the example values above, the [ command complains of a syntax error and returns the status 2.
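The example with [ can be reproduced in a scratch directory. This is a sketch assuming bash and mktemp; the directory is created only so that the glob results are predictable. The quoted variant is an addition for contrast: with quotes, [ performs a plain string comparison on exactly three operands.

```shell
# Assumes bash; works in a throwaway directory holding the five example files.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch bar baz qux zim zum
a='foo b*'

# Unquoted: $a is split into "foo" and "b*", then b* and z* are globbed,
# so [ receives the words: foo bar baz == zim zum ]
[ $a == z* ] 2>/dev/null
unquoted_status=$?

# Quoted: [ receives exactly "foo b*", "==", "z*" and the closing ].
# This is a string comparison, not a pattern match.
[ "$a" == 'z*' ]
quoted_status=$?

echo "unquoted: $unquoted_status, quoted: $quoted_status"
```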
Note: In [[ $a == z* ]], at step 3, the value of a does not undergo word splitting and filename generation, because it's in a context where a single word is expected (the left-hand argument of the conditional operator ==). In most cases, if a single word makes sense at that position then variable expansion behaves like it does in double quotes. However, there's an exception to that rule: in [[ abc == $a ]], if the value of a contains wildcards, then abc is matched against the wildcard pattern. For example, if the value of a is a* then [[ abc == $a ]] is true (because the wildcard * coming from the unquoted expansion of $a matches bc) whereas [[ abc == "$a" ]] is false (because the ordinary character * coming from the quoted expansion of $a does not match bc). Inside [[ … ]], double quotes do not make a difference, except on the right-hand side of the string matching operators (=, ==, != and =~).
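The exception for the right-hand side can be seen with the a* value from the note. A minimal sketch, assuming bash:

```shell
# Assumes bash.
a='a*'

# Unquoted right-hand side: the * coming from $a acts as a wildcard.
if [[ abc == $a ]]; then unquoted=true; else unquoted=false; fi

# Quoted right-hand side: the * is an ordinary character.
if [[ abc == "$a" ]]; then quoted=true; else quoted=false; fi

echo "unquoted: $unquoted, quoted: $quoted"
```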
$ touch ./-c $'a\n12\tb' foo
$ du -hs *
0 a
12 b
0 foo
0 total
As you can see, the -c file was taken as an option to du and is not reported (and you see the total line because of du -c). Also, the file called a\n12\tb is making us think that there are files called a and b.
$ du -hs -- *
0 a
12 b
0 -c
0 foo
That's better. At least this time -c is not taken as an option.
$ du -hs ./*
0 ./a
12 b
0 ./-c
0 ./foo
That's even better. The ./ prefix prevents -c from being taken as an option, and the absence of ./ before b in the output indicates that there's no b file in there, but rather a file whose name contains a newline character (but see note 1 below for further digressions on that).
It's good practice to use the ./ prefix when possible. When that is not possible, or for arbitrary data, you should always use:
cmd -- "$var"
or:
cmd -- $patterns
If cmd doesn't support -- to mark the end of options, you should report it as a bug to its author (except when it's by choice and documented, like for echo).
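As a small illustration of the -- form with a variable, here is a sketch that uses cat as a stand-in for cmd. The file name -n and the variable name var are arbitrary choices for the example, and mktemp is assumed to be available.

```shell
# Works in a throwaway directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'hello\n' > ./-n    # a file whose name looks like an option

var=-n
# -- marks the end of options, so the value of var is taken as a file name.
content=$(cat -- "$var")
echo "$content"
```

Without the --, cat would typically parse -n as an option and, having no file operand left, read from stdin instead.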
There are cases where ./* solves problems that -- doesn't. For instance:
awk -f file.awk -- *
fails if there is a file called a=b.txt in the current directory (it sets the awk variable a to b.txt instead of telling awk to process the file).
awk -f file.awk ./*
doesn't have the problem because ./a is not a valid awk variable name, so ./a=b.txt is not taken as a variable assignment.
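The awk case can be reproduced with a one-line file. This sketch assumes a POSIX awk and mktemp; it uses an inline program that prints the record count in place of the file.awk from the text, and redirects stdin from /dev/null so that awk does not wait for input when it finds no file operand.

```shell
# Works in a throwaway directory containing only a=b.txt.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'one line\n' > 'a=b.txt'

# Even after --, the operand a=b.txt is treated as an assignment, so awk
# has no file to read and falls back to its (empty) stdin.
with_dashes=$(awk 'END { print NR }' -- * < /dev/null)

# ./a=b.txt is not a valid assignment, so the file itself is read.
with_prefix=$(awk 'END { print NR }' ./* < /dev/null)

echo "-- *: $with_dashes record(s), ./*: $with_prefix record(s)"
```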
cat -- * | wc -l
fails if there is a file called - in the current directory, as that tells cat to read from its stdin (- is special to most text processing utilities and to cd/pushd).
cat ./* | wc -l
is OK because ./- is not special to cat.
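The same kind of check works for the file called -. A sketch, assuming mktemp is available; stdin is redirected from /dev/null so that cat returns immediately when it treats - as stdin.

```shell
# Works in a throwaway directory containing only a file named "-".
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'one line\n' > ./-

# "-" still means stdin after --, so the file is never opened.
with_dashes=$(( $(cat -- * < /dev/null | wc -l) ))

# "./-" is an ordinary path, so the file is read.
with_prefix=$(( $(cat ./* < /dev/null | wc -l) ))

echo "-- *: $with_dashes line(s), ./*: $with_prefix line(s)"
```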
Things like:
grep -l -- foo *.txt | wc -l
to count the number of files that contain foo are wrong because they assume file names don't contain newline characters (wc -l counts the newline characters, those output by grep for each file and those in the filenames themselves). You should use instead:
grep -l foo ./*.txt | grep -c /
(counting the lines that contain a / character is more reliable as there can only be one / per filename).
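The difference between the two counts shows up as soon as one matching file has a newline in its name. A sketch, assuming mktemp; the two file names are made up for the example.

```shell
# Works in a throwaway directory.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Two files contain "foo"; one of them has a newline in its name.
echo foo > "$(printf 'a\nb.txt')"
echo foo > c.txt

# Counts newline characters, including the one inside the first file name.
naive=$(( $(grep -l -- foo *.txt | wc -l) ))

# Counts lines containing a /, of which there is exactly one per file name.
robust=$(( $(grep -l foo ./*.txt | grep -c /) ))

echo "wc -l: $naive, grep -c /: $robust"
```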
For recursive grep, the equivalent trick is to use:
grep -rl foo .//. | grep -c //
./* may have some unwanted side effects though.
cat ./*
adds two more characters per file, so would make you reach the limit of the maximum size of arguments+environment sooner. And sometimes you don't want that ./ to be reported in the output. Like:
grep foo ./*
Would output:
./a.txt: foobar
instead of:
a.txt: foobar
Further digressions
1. I feel like I have to expand on that here, following the discussion in comments.
$ du -hs ./*
0 ./a
12 b
0 ./-c
0 ./foo
Above, that ./ marking the beginning of each file means we can clearly identify where each filename starts (at ./) and where it ends (at the newline before the next ./ or the end of the output).
What that means is that the output of du ./* (contrary to that of du -- *) can be parsed reliably, albeit not that easily in a script.
When the output goes to a terminal though, there are plenty more ways a filename may fool you:
- Control characters and escape sequences can affect the way things are displayed. For instance, \r moves the cursor to the beginning of the line, \b moves the cursor back, \e[C forward (in most terminals)...
- Many characters are invisible on a terminal, starting with the most obvious one: the space character.
- There are Unicode characters that look just the same as the slash in most fonts:
$ printf '\u002f \u2044 \u2215 \u2571 \u29F8\n'
/ ⁄ ∕ ╱ ⧸
(see how it goes in your browser).
An example:
$ touch x 'x ' $'y\bx' $'x\n0\t.\u2215x' $'y\r0\t.\e[Cx'
$ ln x y
$ du -hs ./*
0 ./x
0 ./x
0 ./x
0 .∕x
0 ./x
0 ./x
Lots of x's but y is missing.
Some tools like GNU ls would replace the non-printable characters with a question mark (note that ∕ (U+2215) is printable though) when the output goes to a terminal. GNU du does not.
There are ways to make them reveal themselves:
$ ls
x x x?0?.∕x y y?0?.?[Cx y?x
$ LC_ALL=C ls
x x?0?.???x x y y?x y?0?.?[Cx
See how ∕ turned to ??? after we told ls that our character set was ASCII.
$ du -hs ./* | LC_ALL=C sed -n l
0\t./x$
0\t./x $
0\t./x$
0\t.\342\210\225x$
0\t./y\r0\t.\033[Cx$
0\t./y\bx$
$ marks the end of the line, so we can spot the "x" vs "x "; all non-printable characters and non-ASCII characters are represented by a backslash sequence (backslash itself would be represented with two backslashes), which means it is unambiguous. That was GNU sed; it should be the same in all POSIX compliant sed implementations, but note that some old sed implementations are not nearly as helpful.
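The sed -n l representation can be tried on small inputs without creating any files. A minimal sketch, assuming a POSIX compliant sed; the two sample strings are arbitrary.

```shell
# A trailing space becomes visible because $ marks the end of the line.
with_space=$(printf 'x \n' | LC_ALL=C sed -n l)

# A tab is shown as the backslash sequence \t.
with_tab=$(printf 'a\tb\n' | LC_ALL=C sed -n l)

printf '%s\n' "$with_space" "$with_tab"
```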
$ du -hs ./* | cat -vte
0^I./x$
0^I./x $
0^I./x$
0^I.M-bM-^HM-^Ux$
(not standard but pretty common, also cat -A with some implementations). That one is helpful and uses a different representation but is ambiguous ("^I" and <TAB> are displayed the same for instance).
$ du -hs ./* | od -vtc
0000000 0 \t . / x \n 0 \t . / x \n 0 \t .
0000020 / x \n 0 \t . 342 210 225 x \n 0 \t . / y
0000040 \r 0 \t . 033 [ C x \n 0 \t . / y \b x
0000060 \n
0000061
That one is standard and unambiguous (and consistent from implementation to implementation) but not as easy to read.
You'll notice that y never showed up above. That's a completely unrelated issue with du -hs * that has nothing to do with file names but should be noted: because du reports disk usage, it doesn't report other links to a file already listed (not all du implementations behave like that though when the hard links are listed on the command line).
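The hard link behaviour can be observed with two names for the same file. This is a sketch assuming mktemp and a file system that supports hard links; since the result depends on the du implementation, it only counts how many of the two names get reported.

```shell
# Works in a throwaway directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch x
ln x y    # y is a second hard link to the same file

# Each entry reported by du contributes one line containing "./".
reported=$(( $(du -s ./x ./y | grep -c /) ))
echo "du reported $reported of 2 names"
```

An implementation that skips links it has already listed reports only one of the two names.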