Shell Wildcards – Why Nullglob is Not Default


In most shells nullglob isn't the default. That means, for example, if you run this command

ls *

in an empty directory, the shell passes the * glob through as a literal * instead of expanding it to an empty list of arguments. There are ways to change that behaviour so that * in an empty directory expands to an empty list of arguments, which would seem more intuitive.
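The difference can be observed with a small experiment. This is only a sketch: it assumes bash is installed and uses a scratch directory created with mktemp, neither of which comes from the question itself.

```shell
# Create a fresh, empty scratch directory (an assumption of this sketch).
dir=$(mktemp -d) || exit 1

# Default bash: the unmatched * is kept as one literal argument.
default_args=$(cd "$dir" && bash -c 'set -- *; echo "$# $1"')

# bash -O nullglob: the unmatched * is removed, leaving no arguments.
nullglob_args=$(cd "$dir" && bash -O nullglob -c 'set -- *; echo "$#"')

printf 'default:  %s\n' "$default_args"
printf 'nullglob: %s\n' "$nullglob_args"

rm -rf "$dir"
```

The first line reports one argument, the literal *, and the second reports zero arguments.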

So, is there a reason why nullglob is disabled by default? If so, what is that reason?

Best Answer

The nullglob option (which BTW is a zsh invention, only added years later to bash (2.0)) would not be ideal in a number of cases. And ls is a good example:

ls *.txt

Or its more correct equivalent:

ls -- *.txt

With nullglob on, if no files match, that would run ls with no arguments, which is treated as ls -- . (list the current directory). That is probably worse than calling ls with a literal *.txt as argument.

You'd have similar problems with most text utilities:

grep foo *.txt

With nullglob on, that would look for foo on stdin if there's no txt file.
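Both pitfalls are easy to reproduce. The following is a sketch rather than a definitive test: it assumes bash is available, and the scratch directory and the file name notes.md are made up for the illustration.

```shell
dir=$(mktemp -d) || exit 1
: > "$dir/notes.md"   # hypothetical file; nothing in the directory matches *.txt

# nullglob: ls receives no file operands and lists the directory instead.
ls_nullglob=$(cd "$dir" && bash -O nullglob -c 'ls -- *.txt')

# Default: ls receives the literal *.txt and reports an error.
if (cd "$dir" && bash -c 'ls -- *.txt') >/dev/null 2>&1; then
  ls_default=succeeded
else
  ls_default=failed
fi

# nullglob: grep receives no file operands and silently reads standard input.
grep_nullglob=$(cd "$dir" && printf 'foo from stdin\n' |
  bash -O nullglob -c 'grep foo *.txt')

printf 'ls with nullglob printed: %s\n' "$ls_nullglob"
printf 'ls by default: %s\n' "$ls_default"
printf 'grep with nullglob printed: %s\n' "$grep_nullglob"

rm -rf "$dir"
```

With nullglob, ls lists notes.md even though only txt files were asked for, and grep matches a line from its standard input.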

A more sensible default, and the one used by csh, tcsh, zsh and fish 2.3+ (and by early Unix shells), is to cancel the command altogether if the glob doesn't match.

bash (since version 3) has a failglob option for that. One caveat is relevant to this discussion: contrary to ash, AT&T ksh or zsh, bash doesn't support local scopes for options (though that's to change in 4.4), so that option, when enabled globally, does break a few things like the bash-completion functions.
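As a rough illustration of failglob, the sketch below runs the same echo command in an empty scratch directory with and without the option. It assumes bash is installed, and the mktemp directory is an addition of this sketch.

```shell
dir=$(mktemp -d) || exit 1

# Default: echo still runs and prints the unexpanded pattern.
default_out=$(cd "$dir" && bash -c 'echo *.txt')
default_status=$?

# failglob: bash reports an error on stderr and never runs echo.
failglob_out=$(cd "$dir" && bash -O failglob -c 'echo *.txt' 2>/dev/null)
failglob_status=$?

printf 'default:  output=[%s] status=%s\n' "$default_out" "$default_status"
printf 'failglob: output=[%s] status=%s\n' "$failglob_out" "$failglob_status"

rm -rf "$dir"
```

Passing -O failglob on the command line limits the option to that one bash invocation, which sidesteps the global-setting problem described above.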

Note that csh and tcsh are slightly different from zsh, fish or bash -O failglob in cases like:

ls -- *.txt *.html

where all of the globs must fail to match for the command to be cancelled. For instance, if there's one txt file and no html file, that becomes:

ls -- file.txt

You can get that behaviour in zsh with setopt cshnullglob, though a more sensible way to do it in zsh would be to use a glob like:

ls -- *.(txt|html)

In zsh and ksh93, you can also apply nullglob on a per-glob basis, which is a much saner approach than modifying a global setting:

files=(*.txt(N))  # zsh
files=(~(N)*.txt) # ksh93

would create an empty array if there's no txt file instead of failing the command with an error (or, in other shells, making it an array with one literal *.txt element).
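bash has no per-glob qualifier of that kind, so one way to approximate it is to switch nullglob on around the single assignment that needs it. This is a sketch of that workaround, not something taken from the answer: it assumes bash is installed and that nullglob was off beforehand.

```shell
dir=$(mktemp -d) || exit 1   # empty scratch directory, an assumption of this sketch

# Default bash: the array ends up with one element, the literal *.txt.
default_count=$(cd "$dir" && bash -c 'files=(*.txt); echo "${#files[@]}"')

# nullglob enabled only around the assignment: the array ends up empty.
scoped_count=$(cd "$dir" && bash -c '
  shopt -s nullglob      # turn it on just before the glob that needs it
  files=(*.txt)
  shopt -u nullglob      # turn it off again (assumes it was off to begin with)
  echo "${#files[@]}"
')

printf 'default array length:         %s\n' "$default_count"
printf 'scoped nullglob array length: %s\n' "$scoped_count"

rm -rf "$dir"
```

This is clumsier than the zsh and ksh93 forms, since the option has to be toggled by hand.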

Versions of fish prior to 2.3 would work like bash -O nullglob, but would give a warning in interactive use when a glob had no match. Since 2.3, it works like zsh except for globs used in for, set or count.

Now, on a historical note, the behaviour was actually broken by the Bourne shell. In prior versions of Unix, globbing was done via the /etc/glob helper, and that helper behaved like csh: it would fail the command if none of the globs matched any file, and remove the globs with no match otherwise.

So the situation we're in today is due to a bad decision made in the Bourne shell.

Note that the Bourne shell (and the C shell) came with another new Unix feature: the environment. That meant variable expansion (its predecessor only had the $1, $2... positional parameters). The Bourne shell also introduced command substitution.

Another poor design decision of the Bourne shell was to perform globbing (and splitting) upon the expansion of variables and command substitution. That was possibly for backward compatibility with the Thompson shell, where echo $1 would still invoke /etc/glob if $1 contained wildcards (it was more like pre-processor macro expansion there, as in the expanded value was parsed again as shell code).

Failing the command when a glob doesn't match would mean, for instance, that:

pattern='a.*b'
grep $pattern file

would fail the command (unless there are some a.whateverb files in the current directory). csh (which also performs globbing upon variable expansion) does fail the command in that case (and I'd argue it's better than leaving a dormant bug there, even if it's not as good as not doing globbing at all like in zsh).
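The dormant bug can be made visible with a small experiment. This is a sketch under stated assumptions: it needs a Bourne-like sh plus bash, and the scratch directory and the file name a.logb are made up for the illustration.

```shell
dir=$(mktemp -d) || exit 1
: > "$dir/a.logb"   # hypothetical file name that happens to match a.*b as a glob

# Unquoted: the expanded value is globbed, so a file name replaces the regex.
unquoted=$(cd "$dir" && pattern='a.*b' && printf '%s\n' $pattern)

# Quoted: no splitting or globbing, so the regex reaches the command intact.
quoted=$(cd "$dir" && pattern='a.*b' && printf '%s\n' "$pattern")

# With failglob and no matching file, the unquoted form cancels the command.
rm -f "$dir/a.logb"
failglob_out=$(cd "$dir" &&
  bash -O failglob -c 'pattern="a.*b"; echo $pattern' 2>/dev/null)
failglob_status=$?

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
printf 'failglob: output=[%s] status=%s\n' "$failglob_out" "$failglob_status"

rm -rf "$dir"
```

Quoting the expansion, as in grep "$pattern" file, avoids the problem in Bourne-like shells regardless of which glob options are set.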