What's the difference between <<, <<< and < < in bash?
Best Answer
Here document
<< is known as the here-document structure. You tell the shell what the ending text (the delimiter) will be; once that delimiter is seen, everything you typed before it is handed to the program as its input, and the program performs its task on it. Here's what I mean:
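A minimal sketch of such a session, written as a script rather than typed interactively. The five words are illustrative, and the variable name word_count is an assumption made for this example:

```shell
#!/usr/bin/env bash
# Everything between the two EOF markers is passed to wc on stdin.
# -w asks wc for the word count only.
word_count=$(wc -w << EOF
one two three
four five
EOF
)
echo "words: $((word_count))"
```

Run as written, this should report a count of 5, the same result as typing the two lines into wc by hand.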
In this example we run the wc program with EOF as the delimiter string, type in five words, and then type in EOF to signal that we're done giving input. In effect, it's similar to running wc by itself, typing in words, then pressing Ctrl+D.
In bash these are implemented via temp files, usually in the form /tmp/sh-thd.<random string>, while in dash they are implemented as anonymous pipes. This can be observed by tracing system calls with the strace command. Replace bash with sh to see how /bin/sh performs this redirection.
Here string
<<< is known as a here-string. Instead of typing in text, you give a pre-made string of text to a program. For example, with a program such as bc we can do bc <<< 5*4 to get the output for just that specific case, with no need to run bc interactively.
Here-strings in bash are implemented via temporary files, usually in the format /tmp/sh-thd.<random string>, which are later unlinked. They therefore occupy some space temporarily but do not show up in the list of /tmp directory entries, and effectively exist as anonymous files. The shell itself can still reference such a file through a file descriptor; that descriptor is inherited by the command and later duplicated onto file descriptor 0 (stdin) via the dup2() function. This can be observed by tracing syscalls: the temp file is opened as fd 3, data is written to it, it is re-opened with the O_RDONLY flag as fd 4 and later unlinked, and then dup2() copies it onto fd 0, which is later inherited by cat. Note that bash 5.1 and later use a pipe instead of a temporary file when the text fits in the pipe buffer, so a trace may look different there.
Opinion: because here-strings make use of temporary text files, this is possibly the reason why here-strings always append a trailing newline, since a text file by POSIX definition has to consist of lines that end in a newline character.
Process Substitution
As tldp.org explains, process substitution feeds the output of a process (or processes) into the stdin of another process.
So in effect this is similar to piping the stdout of one command to another, e.g. echo foobar barfoo | wc. But notice: in the bash manpage you will see that it is denoted as <(list). So basically you can redirect the output of multiple (!) commands.
Note: technically, when you say < < you aren't referring to one thing but two: a redirection with a single <, and a process substitution, <( . . . ), whose output is being redirected.
Now what happens if we do just process substitution?
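A quick way to see this is to hand the substitution to echo, which prints its argument instead of reading from it. This is a sketch, assuming a Linux system where bash backs process substitution with /dev/fd; the exact number may differ, and fd_path is a name chosen for this example.

```shell
#!/usr/bin/env bash
# echo does not read the file; it just prints the name bash substituted in.
fd_path=$(echo <(echo bar))
echo "$fd_path"    # typically /dev/fd/63 on Linux
```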
As you can see, the shell creates a temporary file descriptor, /dev/fd/63, where the output goes (which, according to Gilles's answer, is an anonymous pipe). That means < redirects that file descriptor as input into a command.
So a very simple example would be to use process substitution to feed the output of two echo commands into wc:
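A sketch of that command follows. The outer read and the names lines, words and bytes are additions for this example, used only to normalise wc's column spacing.

```shell
#!/usr/bin/env bash
# The first < is an ordinary redirection; <( ... ) is the process substitution.
wc < <(echo bar; echo foo)

# Same command again, with the three counts captured into variables.
read -r lines words bytes < <(wc < <(echo bar; echo foo))
echo "lines=$lines words=$words bytes=$bytes"
```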
So here we make the shell create a file descriptor for all the output that happens inside the parentheses and redirect that as input to wc. As expected, wc receives that stream from the two echo commands, which by themselves would output two lines, each having a word, and appropriately we have 2 words, 2 lines, and 6 characters plus two newlines counted.
How is process substitution implemented? We can find out using the trace below (output shortened for brevity)
The above trace on Ubuntu (which also applies to Linux in general) suggests that process substitution is implemented by repeatedly forking multiple subprocesses (so process 8953 forks multiple child processes 8954, 8955, 8956, etc.). All these subprocesses then communicate back via their stdout, but that is duplicated (that is, copied) onto the next available file descriptor, counting downwards from 63. Why start at 63? That may be a good question for the developers. It is known that bash can use fd 255 for saving file descriptors for the "main" command/pipeline when its streams are redirected.
Side Note: Process substitution may be referred to as a bashism (a command or structure usable in advanced shells like bash, but not specified by POSIX), but it was implemented in ksh before bash existed, as the ksh man page and this answer suggest. Shells like tcsh and mksh, however, do not have process substitution. So how could we go about redirecting the output of multiple commands into another command without process substitution? Grouping plus piping, for instance (echo bar; echo foo) | wc.
Effectively this is the same as the above example. However, it is different under the hood from process substitution, since we link the stdout of the whole subshell and the stdin of wc with a pipe. Process substitution, on the other hand, makes a command read from a temporary file descriptor.
So if we can do grouping with piping, why do we need process substitution? Because sometimes we cannot use piping. Consider the example below: comparing the outputs of two commands with diff (which needs two files, and in this case we are giving it two file descriptors)
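Both approaches can be sketched side by side. The printf inputs stand in for two real commands and are assumptions made for this example, as are the variable names.

```shell
#!/usr/bin/env bash
# Grouping plus piping: the subshell's stdout becomes wc's stdin.
piped=$( (echo bar; echo foo) | wc -l )

# Process substitution: diff receives two /dev/fd paths it can open as files.
# diff exits non-zero when its inputs differ, hence the "|| true".
diff_out=$(diff <(printf 'a\nb\n') <(printf 'a\nc\n') || true)

echo "lines through the pipe: $((piped))"
echo "$diff_out"
```

A pipe can feed only one stream into a command's stdin, while diff needs two inputs at once, which is why each input gets its own substitution here.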