Bash – Why Does ls -d Also List Files and Where Is It Documented?

bash ls options shell wildcards

When I run ls --directory a*, I expect it to list only directories whose names start with a,
BUT it lists files AND directories starting with a.

Questions:

Where might I find some documentation on this, other than man and info, which I think I searched thoroughly?
Does this work in bash only?

Best Answer

The a* and *a* syntax is implemented by the shell, not by the ls command.

When you type

ls a*

at your shell prompt, the shell expands a* to a list of all existing files in the current directory whose names start with a. For example, it might expand a* to the sequence a1 a2 a3, and pass those as arguments to ls. The ls command itself never sees the * character; it only sees the three arguments a1, a2, and a3.

For purposes of wildcard expansion, "files" refers to all entities in the current directory. For example, a1 might be a normal file, a2 might be a directory, and a3 might be a symlink. They all have directory entries, and the shell's wildcard expansion doesn't care what kind of entity those entries refer to.
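To see the expansion for yourself, here is a small sketch that sets up a scratch directory with the three kinds of entries mentioned above. The names a1, a2, and a3 come from the example; the scratch directory and the variable names are just for illustration.

```shell
# Build a scratch directory holding a regular file, a directory, and a symlink.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch a1
mkdir a2
ln -s a1 a3

# The shell replaces a* with the matching names before the command runs,
# so printf receives three separate arguments and never sees the '*'.
expanded=$(printf '<%s> ' a*)
echo "$expanded"    # prints: <a1> <a2> <a3>

# Quoting suppresses the expansion: the command receives the literal pattern.
literal=$(printf '<%s> ' 'a*')
echo "$literal"     # prints: <a*>
```

The expansion picks up all three names regardless of what kind of entry each one is.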

Practically all shells you're likely to run across (bash, sh, ksh, zsh, csh, tcsh, ...) implement wildcards. The details may vary, but the basic syntax of * matching zero or more characters and ? matching any single character is reasonably consistent.

For bash in particular, this is documented in the "Filename expansion" section of the bash manual; run info bash and search for "Filename expansion", or see here.

The fact that this is done by the shell, and not by individual commands, has some interesting (and sometimes surprising) consequences. The best thing about it is that wildcard handling is consistent for (very nearly) all commands; if the shell didn't do this, inevitably some commands wouldn't bother, and others would do it in subtly different ways that the author thought were "better". (I think the Windows command shell has this problem, but I'm not familiar enough with it to comment further.)

On the other hand, it's difficult to write a command to rename multiple files. If you write:

mv *.log *.log.bak

it will probably fail, since *.log.bak is expanded based on the files that already exist in the current directory. There are commands that do this kind of thing, but they have to use their own syntax to specify how the files are to be renamed. Some commands (such as find) can do their own wildcard expansion; you have to quote the arguments to suppress the shell's expansion:

find . -name '*.txt' -print
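For the rename case above, one common workaround (not the only one) is a plain loop that builds each new name explicitly. This is a minimal sketch in a scratch directory; the file names one.log and two.log are made up for the demonstration.

```shell
# Work in a scratch directory so no real files are touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch one.log two.log

# Loop over the files that exist now and build each new name explicitly.
for f in *.log; do
    # If nothing matched, the pattern stays unexpanded; skip it.
    [ -e "$f" ] || continue
    mv -- "$f" "$f.bak"
done

renamed=$(printf '%s ' *.log.bak)
echo "$renamed"    # prints: one.log.bak two.log.bak
```

The shell still does all the wildcard expansion here; the loop just gives you a place to compute the target name for each file.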

The shell's wildcard expansion is based entirely on the syntax of the command-line argument and the set of existing files. It can't be affected by the meaning of the command. For example, if you want to move all .log files up to the parent directory, you can type:

mv *.log ..

If you forget the .. :

mv *.log

and there happen to be exactly two .log files in the current directory, it will expand to:

mv one.log two.log

which will rename one.log to two.log, overwriting (clobbering) the existing two.log.
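A cheap way to check what a wildcard command will really do is to put echo in front of it first. The sketch below reproduces the accident in a scratch directory; the file contents "first" and "second" are placeholders so you can see which file survives.

```shell
# Work in a scratch directory with exactly two .log files.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'first\n' > one.log
printf 'second\n' > two.log

# Prefixing the command with echo shows the argument list mv would receive.
preview=$(echo mv *.log)
echo "$preview"    # prints: mv one.log two.log

# Running it for real moves one.log onto two.log, replacing the old two.log.
mv *.log
remaining=$(printf '%s ' *.log)
contents=$(cat two.log)
echo "left: $remaining contents: $contents"
```

After the mv, only two.log remains, and it holds what used to be in one.log.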

EDIT: And after 52 upvotes, an accept, and a Guru badge, maybe I should actually answer the question in the title.

The -d or --directory option to ls doesn't tell it to list only directories. It tells it to list directories just as themselves, not their contents. If you give a directory name as an argument to ls, by default it will list the contents of the directory, since that's usually what you're interested in. The -d option tells it to list just the directory itself. This can be particularly useful when combined with wildcards. If you type:

ls -l a*

ls will give you a long listing of each file whose name starts with a, and of the contents of each directory whose name starts with a. If you just want a list of the files and directories, one line for each, you can use:

ls -ld a*

which is equivalent to:

ls -l -d a*

Remember again that the ls command never sees the * character.
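If what you actually want is only the directories, one approach is to let the shell do the filtering: a pattern ending in / matches only directories. Here is a sketch in a scratch directory; the names apple.txt, archive, and inside.txt are invented for the demonstration.

```shell
# Scratch directory with one file and one directory, both starting with a.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch apple.txt
mkdir archive
touch archive/inside.txt

# Without -d, ls descends into archive and lists inside.txt as well.
without_d=$(ls a*)

# With -d, each argument is listed as itself: the file and the directory.
with_d=$(ls -d a*)

# A trailing slash makes the pattern match directories only,
# so ls -d receives just the directory names.
dirs_only=$(ls -d a*/)

echo "$with_d"
echo "$dirs_only"    # prints: archive/
```

Note that a*/ also matches symlinks that point to directories, and that the -d is still needed so ls doesn't list the contents of each match.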

As for where this is documented, man ls will show you the documentation for the ls command on just about any Unix-like system. On most Linux-based systems, the ls command is part of the GNU coreutils package; if you have the info command, either info ls or info coreutils ls should give you more definitive and comprehensive documentation. Other systems, such as macOS, may use different versions of the ls command, and may not have the info command; for those systems, use man ls. And ls --help will show a relatively short usage message (117 lines on my system) if you're using the GNU coreutils implementation.

And yes, even experts need to consult the documentation now and then. See also this classic joke.

Bash – Why does CDPATH not work as documented in the manuals

Because the manuals are wrong.

The '93 Korn shell is wrong, too.

The 1997 Single Unix Specification says:

If the directory operand does not begin with a slash (/) character, and the first component is not dot or dot-dot, cd will search for directory relative to each directory named in the CDPATH variable, in the order listed.

The 2016 Single Unix Specification says the same in a different, and slightly redundant, way:

3. If the directory operand begins with a <slash> character, set curpath to the operand and proceed to step 7.
4. If the first component of the directory operand is dot or dot-dot, proceed to step 6.
[…]
6. Set curpath to the directory operand.

None of the manuals mention the part about . and .., but that is what every shell apart from the '93 Korn shell is actually doing, despite what their manuals say:

% export CDPATH=/tmp:
% lksh -c 'cd wibble'
/tmp/wibble
% dash -c 'cd wibble'
/tmp/wibble
% posh -c 'cd wibble'
/tmp/wibble
% bash -c 'cd wibble'
/tmp/wibble
% mksh -c 'cd wibble'
/tmp/wibble
% zsh -c 'cd wibble ; pwd'
/tmp/wibble
%
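The dot and dot-dot rule quoted from the specification can be checked with a small experiment. This is a sketch that assumes a shell such as bash or dash; the directory name wibble is taken from the transcript above, while base and work are hypothetical scratch directories.

```shell
# Two scratch directories, each containing its own wibble.
base=$(mktemp -d)
work=$(mktemp -d)
mkdir "$base/wibble" "$work/wibble"
cd "$work" || exit 1

# A plain name is looked up in CDPATH first, so this lands in $base/wibble.
# (cd prints the directory it found via CDPATH; that output is discarded.)
via_cdpath=$(CDPATH=$base; cd wibble >/dev/null && pwd -P)

# A leading ./ bypasses CDPATH, so this lands in $work/wibble.
via_dot=$(CDPATH=$base; cd ./wibble >/dev/null && pwd -P)

echo "$via_cdpath"
echo "$via_dot"
```

Each cd runs inside a command substitution, so neither the CDPATH setting nor the directory change leaks into the calling shell.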
