From man bash:
Process Substitution
Process substitution is supported on systems that support named pipes (FIFOs) or the /dev/fd method of naming open files. It takes the form of <(list) or >(list). The process list is run with its input or output connected to a FIFO or some file in /dev/fd. The name of this file is passed as an argument to the current command as the result of the expansion. If the >(list) form is used, writing to the file will provide input for list. If the <(list) form is used, the file passed as an argument should be read to obtain the output of list.
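To make the mechanism in that excerpt concrete, here is a rough Python 3 sketch of what the <(list) form amounts to on a system with /dev/fd: create a pipe, let one process write to it, and hand the other process a /dev/fd path as an ordinary file name argument. The function name read_through_dev_fd is made up for this illustration, and bash itself may use a FIFO instead, so treat it as a model and not as a description of bash's implementation.

```python
import os
import subprocess
import sys


def read_through_dev_fd(producer_argv, consumer_argv):
    """Run consumer_argv with one extra argument: a /dev/fd path that is
    connected to the standard output of producer_argv (hypothetical helper)."""
    read_fd, write_fd = os.pipe()
    # The producer plays the role of "list": its stdout feeds the pipe.
    producer = subprocess.Popen(producer_argv, stdout=write_fd)
    os.close(write_fd)  # only the producer keeps the write end open
    path = "/dev/fd/%d" % read_fd
    try:
        # pass_fds keeps read_fd open under the same number in the consumer,
        # so the path names the same pipe when the consumer opens it.
        result = subprocess.run(consumer_argv + [path], pass_fds=[read_fd],
                                stdout=subprocess.PIPE, check=True)
    finally:
        os.close(read_fd)
        producer.wait()
    return path, result.stdout


if __name__ == "__main__":
    producer = [sys.executable, "-c", "print('hello from list')"]
    consumer = [sys.executable, "-c",
                "import sys; sys.stdout.write(open(sys.argv[1]).read())"]
    path, output = read_through_dev_fd(producer, consumer)
    print(path)                      # the name the consumer received
    print(output.decode(), end="")   # what the consumer read from it
```

The consumer never knows it is reading from a pipe; it just opens the file name it was given, which is the whole point of the expansion.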
You can search manpages by pressing / and then typing your search string, which is a good way of finding information like this. It does of course require that you know in which manpage to search :)
You have to quote the ( though, because it has a special meaning when searching. To find the relevant section in the bash manpage, type />\(.
I am not at all convinced of this, but let's suppose for the sake of argument that you could, if you're prepared to put in enough effort, parse the output of ls reliably, even in the face of an "adversary": someone who knows the code you wrote and is deliberately choosing filenames designed to break it.
Even if you could do that, it would still be a bad idea.
Bourne shell is not a good language. It should not be used for anything complicated, unless extreme portability is more important than any other factor (e.g. autoconf).
I claim that if you're faced with a problem where parsing the output of ls seems like the path of least resistance for a shell script, that's a strong indication that whatever you are doing is too complicated for shell and you should rewrite the entire thing in Perl or Python. Here's your last program in Python:
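import os, sys
# Print "inode directory name" for every entry below the current directory.
for subdir, dirs, files in os.walk("."):
    for f in dirs + files:
        # lstat() does not follow symlinks
        ino = os.lstat(os.path.join(subdir, f)).st_ino
        sys.stdout.write("%d %s %s\n" % (ino, subdir, f))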
This has no issues whatsoever with unusual characters in filenames. The output is ambiguous in the same way the output of ls is ambiguous, but that wouldn't matter in a "real" program (as opposed to a demo like this), which would use the result of os.path.join(subdir, f) directly.
Equally important, and in stark contrast to the thing you wrote, it will still make sense six months from now, and it will be easy to modify when you need it to do something slightly different. By way of illustration, suppose you discover a need to exclude dotfiles and editor backups, and to process everything in alphabetical order by basename:
import os, sys
filelist = []
for subdir, dirs, files in os.walk("."):
    for f in dirs + files:
        # Skip dotfiles and editor backups (names ending in "~")
        if f[0] == '.' or f[-1] == '~': continue
        lstat = os.lstat(os.path.join(subdir, f))
        filelist.append((f, subdir, lstat.st_ino))
# Sort by basename, which is the first element of each tuple
filelist.sort(key = lambda x: x[0])
for f, subdir, ino in filelist:
    sys.stdout.write("%d %s %s\n" % (ino, subdir, f))
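For completeness, here is a minimal sketch of the "real program" shape mentioned earlier, where the joined path is used directly and never round-trips through text. The generator name iter_inodes is invented for this example, and it assumes Python 3; it is one possible arrangement, not the only sensible one.

```python
import os


def iter_inodes(top="."):
    """Yield (inode, path) for every directory and file below top.

    lstat() is used so that symlinks report their own inode.
    """
    for subdir, dirs, files in os.walk(top):
        for name in dirs + files:
            path = os.path.join(subdir, name)
            yield os.lstat(path).st_ino, path


if __name__ == "__main__":
    for ino, path in iter_inodes("."):
        # repr() keeps the display unambiguous even for names with newlines;
        # the path value itself is never parsed back from text.
        print(ino, repr(path))
```

Because callers receive the path as a value, a newline or space in a filename is just another character in a string, and nothing has to be parsed.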
Best Answer
That started as a hack in the Bourne shell. In the Bourne shell, IFS word splitting was done (after tokenisation) on all words in list context (command line arguments or the words the for loops loop on). If you had a first line setting IFS to i and var to file2.txt, followed by a second line reading edit file.txt $var, that second line would be tokenised into 3 words, $var would be expanded, and split+glob would be done on all three words, so you would end up running ed with t, f, le.txt, f, le2.txt as arguments.
Quoting parts of that would prevent the split+glob. The Bourne shell initially remembered which characters were quoted by setting the 8th bit on them internally (that changed later when Unix became 8-bit clean, but the shell still did something similar to remember which byte was quoted).
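The splitting rule just described can be mimicked with a toy model. The sketch below is Python 3, not shell, and all the names in it (bourne_split, plain, quoted) are invented for illustration. Each character carries a flag standing in for the 8th bit, and every word is split on unquoted IFS characters. Globbing is left out, and empty fields are simply dropped, which is a simplification and not a claim about the real Bourne source.

```python
def plain(text):
    """A word whose characters are all unquoted."""
    return [(ch, False) for ch in text]


def quoted(text):
    """A word whose characters are all quoted (the '8th bit' is set)."""
    return [(ch, True) for ch in text]


def bourne_split(words, ifs=" \t\n"):
    """Split every word on IFS characters that are not marked as quoted."""
    fields = []
    for word in words:
        current = []
        for ch, is_quoted in word:
            if ch in ifs and not is_quoted:
                # An unquoted IFS character ends the current field.
                if current:
                    fields.append("".join(current))
                    current = []
            else:
                current.append(ch)
        if current:
            fields.append("".join(current))
    return fields


if __name__ == "__main__":
    # The three words of the example, after $var has been expanded, with IFS=i
    print(bourne_split([plain("edit"), plain("file.txt"), plain("file2.txt")],
                       ifs="i"))
    # Quoting the command name protects it from splitting
    print(bourne_split([quoted("edit"), plain("file.txt")], ifs="i"))
```

With IFS set to i, the first call yields ed, t, f, le.txt, f, le2.txt, matching the argument list given in the text, while the quoted edit in the second call survives intact.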
Both $* and $@ were the concatenation of the positional parameters with a space in between. But there was special processing of $@ when inside double quotes. If $1 contained foo bar and $2 contained baz, "$@" would expand to:
foo bar baz
^^^^^^^ ^^^
(with the ^s above indicating which of the characters have the 8th bit set). The first space was quoted (had the 8th bit set) but not the second one (the one added in between words).
And it's the IFS splitting that takes care of separating the arguments (assuming the space character is in $IFS as it is by default). That's similar to how $* was expanded in its predecessor the Mashey shell (itself based on the Thompson shell, while the Bourne shell was written from scratch).
That explains why in the Bourne shell initially "$@" would expand to the empty string instead of nothing at all when the list of positional parameters was empty (you had to work around it with ${1+"$@"}), why it didn't keep the empty positional parameters and why "$@" didn't work when $IFS didn't contain the space character.
The intention was to be able to pass the list of arguments verbatim to another command, but that didn't work properly for the empty list, for empty elements or when $IFS didn't contain space (the first two issues were eventually fixed in later versions).
The Korn shell (on which the POSIX spec is based) changed that behaviour in a few ways:
- IFS splitting is only performed on the result of unquoted expansions (not on literal words like edit or file.txt in the example above).
- $* and $@ are joined with the first character of $IFS, or with a space when $IFS is empty, except that for a quoted "$@" that joiner is unquoted like in the Bourne shell, and for a quoted "$*" when IFS is empty, the positional parameters are appended without separator.
- It added arrays, with ${array[@]} and ${array[*]} reminiscent of Bourne's $* and $@ but starting at index 0 instead of 1, and sparse (more like associative arrays), which means $@ cannot really be treated as a ksh array (compare with csh/rc/zsh/fish/yash where $argv/$* are normal arrays).
- "$@" when $# is 0 now expands to nothing instead of the empty string, and "$@" works when $IFS doesn't contain spaces, except when IFS is empty. An unquoted $* without wildcards expands to one argument (where the positional parameters are joined with space) when $IFS is empty.
ksh93 fixed the remaining few problems above. In ksh93, $* and $@ expand to the list of positional parameters, separated regardless of the value of $IFS, and then further split+globbed+brace-expanded in list contexts; $* is joined with the first byte (not character) of $IFS; "$@" in list contexts expands to the list of positional parameters, regardless of the value of $IFS. In non-list context, like in var=$@, $@ is joined with space regardless of the value of $IFS.
bash's arrays are designed after the ksh ones. The differences are:
- the first character of $IFS is used instead of the first byte;
- the handling of $* when non-quoted in non-list context when $IFS is empty.
While the POSIX spec used to be pretty vague, it now more or less specifies the bash behaviour.
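To see why the original Bourne "$@" failed in exactly the three ways listed earlier, it can help to model the mechanism. The sketch below is Python 3 and a toy model only: the function name bourne_quoted_at is invented, the quoted flag stands in for the 8th bit, and the rule that a quoted word never vanishes is an assumption made to reproduce the behaviour the text describes, not something taken from the Bourne source.

```python
def bourne_quoted_at(params, ifs=" \t\n"):
    """Toy model of the original Bourne "$@": return the resulting arguments."""
    # Build a single word: parameter characters are quoted, while the
    # spaces inserted between parameters are not.
    word = []
    for index, param in enumerate(params):
        if index:
            word.append((" ", False))
        word.extend((ch, True) for ch in param)

    # IFS splitting only acts on unquoted IFS characters.
    fields, current = [], []
    for ch, is_quoted in word:
        if ch in ifs and not is_quoted:
            if current:
                fields.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        fields.append("".join(current))

    # Assumption: a quoted word never disappears entirely, so an empty
    # expansion still leaves one empty argument behind.
    return fields or [""]


if __name__ == "__main__":
    print(bourne_quoted_at(["foo bar", "baz"]))           # the intended case
    print(bourne_quoted_at([]))                           # empty list
    print(bourne_quoted_at(["a", "", "b"]))               # empty element
    print(bourne_quoted_at(["foo bar", "baz"], ifs=":"))  # no space in IFS
```

In this model the normal case gives back foo bar and baz as two arguments, the empty list produces a single empty argument, the empty middle element is lost, and without a space in IFS everything collapses into one argument.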
The list of positional parameters is different from normal arrays in ksh or bash in that:
- indices start at 1 instead of 0 (except in "${@:0}", which includes $0 (not a positional parameter, and in functions gives you the name of the function or not depending on the shell and how the function was defined));
- shift can be used.
In zsh or yash, where arrays are normal arrays (not sparse, indices start at one like in all other shells but ksh/bash), $* is treated as a normal array. zsh has $argv as an alias for it (for compatibility with csh). $* is the same as $argv or ${argv[*]} (arguments joined with the first character of $IFS but still separated out in list contexts). "$@", like "${argv[@]}" or "${*[@]}", undergoes the Korn-style special processing.