I want to list files in a certain subdirectory, but I'm doing so as part of a `docker exec` inside a docker container, so I don't want to bother starting up a shell that I don't really need. Is it possible to find all the matches for a glob with a simple command line tool, and not just a shell?

For example, my current invocation is `bash -l -c 'echo /usr/local/conda-meta/*.json'`. Is it possible to simplify this using a commonly available tool, resulting in something like `globber /usr/local/conda-meta/*.json`, which would be much simpler and lighter weight?
Best Answer
`sh` is simple and commonly available. `sh` is the tool that is invoked to parse command lines in things like `system(cmdline)` in many languages. Many OSes, including some GNU ones, have stopped using `bash` (the GNU shell) to implement `sh` because it has become too bloated for the simple job of parsing command lines and interpreting POSIX `sh` scripts.

Your `bash -l -c 'echo /usr/local/conda-meta/*.json'` command line is possibly being interpreted by a `sh` invocation already. So possibly you can just do `printf '%s\n' /usr/local/conda-meta/*.json` directly. If not: `sh -c 'printf "%s\n" /usr/local/conda-meta/*.json'`.
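As a concrete sketch of the `sh -c` form, here it is run against a scratch directory. The directory and its file names are made up for illustration and stand in for `/usr/local/conda-meta`:

```sh
# Scratch directory standing in for /usr/local/conda-meta (file names are made up).
dir=$(mktemp -d) || exit
touch "$dir/numpy.json" "$dir/zlib.json" "$dir/history"

# The glob is expanded by sh itself. The directory is passed as $1 so that
# the quoting inside the inline script stays simple.
listing=$(sh -c 'printf "%s\n" "$1"/*.json' sh "$dir")
printf '%s\n' "$listing"

rm -rf "$dir"
```

With Docker, the same inline script would follow the container name, as in `docker exec mycontainer sh -c '...'`, where `mycontainer` is a placeholder.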
You could also use `find` here. `find` doesn't do globbing, but it can report file names that match patterns similar to shell ones.

Or with some `find` implementations:

(Note that the `LC_ALL=C` needed here, so that `*` matches any sequence of bytes and not just those forming valid characters in the current locale, is a shell construct. If that command line is not interpreted by a shell, you may need to change it to `env LC_ALL=C find...`.)

Some differences with shell globs:
- Hidden files are included (add `! -name '.*'` to exclude them).
- Paths may be reported in the form `/usr/local/conda-meta/./file.json`.
- Patterns like `x*/y/../*z` are not easily translated (also note the differing behaviour with respect to symlinks to directories in that case).

In any case, you can't use `echo` to output arbitrary data.

My next question would be: what are you going to do with that output? With `echo`, you're outputting those file paths separated by SPC characters, and with my `printf` or `find` above, delimited by NL characters. Both NL and SPC are perfectly valid characters in file names, so those outputs are not reliably post-processable. You could use `'%s\0'` instead of `'%s\n'` (or use `find`'s `-print0` if supported); that is not suitable for display to a user, but it is post-processable.

In terms of efficiency, I compared Ubuntu 20.04's `/bin/sh` (dash 0.5.10.2) with its `find` (GNU `find` 4.7.0).

Startup time:
Globbing some `json` files:

Even `bash` is hardly slower than `find` here:

Of course YMMV depending on the system, implementation, version of the respective utilities and the libraries they're linked against.
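Because results vary between systems, it can help to try both approaches locally. The sketch below is only an illustration: the scratch directory and file names are made up, the `find` command is one POSIX-style way to stay within a single directory level (not necessarily the exact command meant above), and `-print0` is assumed to be supported. It counts names rather than timing anything, to show why NUL-delimited output is the post-processable one and how `find` treats hidden files.

```sh
# Scratch directory with made-up file names; one name contains a newline.
dir=$(mktemp -d) || exit
touch "$dir/a.json" "$dir/.hidden.json" "$dir/notes.txt" "$dir/two
lines.json"

# Shell glob, NL-delimited: the embedded newline makes 2 files look like 3 lines.
nl_lines=$(( $(sh -c 'printf "%s\n" "$1"/*.json' sh "$dir" | wc -l) ))

# Shell glob, NUL-delimited: one NUL per file, whatever the names contain.
nul_names=$(( $(sh -c 'printf "%s\0" "$1"/*.json' sh "$dir" | tr -cd '\0' | wc -c) ))

# find limited to one directory level; -print0 is not available everywhere.
# Unlike the glob, -name '*.json' also matches .hidden.json.
find_all=$(( $(LC_ALL=C find "$dir/." ! -name . -prune -name '*.json' -print0 |
  tr -cd '\0' | wc -c) ))

# Adding ! -name '.*' brings find in line with the glob.
find_visible=$(( $(LC_ALL=C find "$dir/." ! -name . -prune -name '*.json' \
  ! -name '.*' -print0 | tr -cd '\0' | wc -c) ))

printf 'NL lines: %s, NUL names: %s, find: %s, find without hidden: %s\n' \
  "$nl_lines" "$nul_names" "$find_all" "$find_visible"
rm -rf "$dir"
```

To compare speed as well, either command can be wrapped in whatever `time` facility the system provides.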
Now, on a historical note, the glob name actually comes from the name of a utility called `glob` in the very first versions of Unix in the early 70s. It was located in `/etc` and was invoked by `sh` as a helper to expand wildcard patterns.

You'll find a few projects online to revive that very old shell, such as https://etsh.nl/. More as an exercise in archaeology, you could build the `glob` utility from there and then be able to do:

A few notes of warning though.
- `[!x]` (let alone `[^x]`) is not supported.
- Bytes with the 8th bit set are not handled (`$'\xe9*'` would match the same thing as `i*`, and `$'\xaa*'` would match on filenames that start with `*`; the shell would set that 8th bit for the quoted characters before invoking `glob`).
- Ranges like `[a-f]` match on byte value rather than collation order (in practice, that's generally an advantage IMO).
- A pattern without matches results in a `No match` error (again, probably preferable; that's something that was broken by the Bourne shell in the late 70s).

The `glob` functionality was later moved into the shell, starting with the PWB shell and Bourne shell in the late 70s. Later, some `fnmatch()` and `glob()` functions were added to the C library to allow that feature to be used from other applications, but I'm not aware of a standard nor common utility that is a bare interface to that function. Even `perl` used to invoke `csh` in its early days to expand glob patterns.
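Absent such a utility, a thin wrapper around the shell's own expansion gets close to the `globber` command imagined in the question. This is only a sketch: `globber` is a hypothetical name, it is still `sh` doing the work, and unlike a plain glob it prints nothing for a pattern that has no match.

```sh
# globber: hypothetical helper, named after the command imagined in the question.
# Each argument is treated as a pattern and expanded by the shell itself.
globber() (
  IFS=       # empty IFS: unquoted expansions below are not split on whitespace
  set +f     # make sure pathname expansion is enabled
  for pattern do
    for path in $pattern; do          # unquoted on purpose, to trigger globbing
      # sh leaves an unmatched pattern as-is, so only report paths that exist
      if [ -e "$path" ] || [ -L "$path" ]; then
        printf '%s\n' "$path"
      fi
    done
  done
)
```

It would be called with the pattern quoted, as in `globber '/usr/local/conda-meta/*.json'`, so that the expansion happens inside the function rather than in the calling shell. The function body runs in a subshell, so the `IFS` change does not leak into the caller.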