First, note that the syntax for closing is 5>&- or 6<&-, depending on whether the file descriptor was opened for writing or for reading. There seems to be a typo or formatting glitch in that blog post.
Here's the commented script.
exec 5>/tmp/foo # open /tmp/foo for writing, on fd 5
exec 6</tmp/bar # open /tmp/bar for reading, on fd 6
cat <&6 | # call cat, with its standard input connected to
# what is currently fd 6, i.e., /tmp/bar
while read a; do #
echo $a >&5 # write to fd 5, i.e., /tmp/foo
done #
There's no closing here. Because all the inputs and outputs are going to the same place in this simple example, the use of extra file descriptors is not necessary. You could write
cat </tmp/bar |
while read a; do
echo $a
done >/tmp/foo
Using explicit file descriptors becomes useful when you want to write to multiple files in turn. For example, consider a script that writes data to an output file, logging information to a log file, and possibly error messages as well. That means three output channels: one for data, one for logs and one for errors. Since there are only two standard descriptors for output, a third is needed. You can call exec to open the output files:
exec >data-file
exec 3>log-file
echo "first line of data"
echo "this is a log line" >&3
…
if something_bad_happens; then echo error message >&2; fi
exec >&- # close the data output file
echo "output file closed" >&3
The remark about efficiency comes in when you have a redirection in a loop, like this (assume the file is empty to begin with):
while …; do echo $a >>/tmp/bar; done
At each iteration, the program opens /tmp/bar, seeks to the end of the file, appends some data and closes the file. It is more efficient to open the file once and for all:
while …; do echo $a; done >/tmp/bar
When there are multiple redirections happening at different times, calling exec to perform redirections rather than wrapping a block in a redirection becomes useful.
exec >/tmp/bar
while …; do echo $a; done
You'll find several other examples of redirection by browsing the io-redirection tag on this site.
3>&4- is a ksh93 extension, also supported by bash, that is short for 3>&4 4>&-; that is, 3 now points to where 4 used to, and 4 is now closed, so what was pointed to by 4 has now moved to 3.
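As a sketch of how the move operator behaves (in ksh93 or a recent bash; the scratch file is just for illustration):

```shell
tmp=$(mktemp)               # illustrative scratch file
exec 4>"$tmp"               # open it for writing on fd 4
exec 3>&4-                  # move fd 4 to fd 3 (same as: exec 3>&4 4>&-)
echo "via fd 3" >&3         # works: fd 3 now points to the file
echo "via fd 4" 2>/dev/null >&4 || true   # fails: fd 4 is now closed (EBADF)
exec 3>&-                   # done with fd 3
content=$(cat "$tmp"); rm -f "$tmp"
printf '%s\n' "$content"    # only the fd 3 line made it into the file
```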
Typical usage would be in cases where you've duplicated stdin or stdout to save a copy of it and want to restore it later, as in the following.
Suppose you want to capture the stderr of a command (and stderr only) in a variable, while leaving its stdout alone.
Command substitution, as in var=$(cmd), creates a pipe. The writing end of the pipe becomes cmd's stdout (file descriptor 1) and the other end is read by the shell to fill up the variable.
Now, if you want stderr to go to the variable, you could do var=$(cmd 2>&1). Now both fd 1 (stdout) and fd 2 (stderr) go to the pipe (and eventually to the variable), which is only half of what we want.
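A quick sketch of that, using sh -c as a toy command that writes one line to each stream:

```shell
# with 2>&1, both streams feed the command-substitution pipe
var=$(sh -c 'echo out; echo err >&2' 2>&1)
printf '%s\n' "$var"    # both the "out" and "err" lines ended up in $var
```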
If we do var=$(cmd 2>&1-) (short for var=$(cmd 2>&1 >&-)), now only cmd's stderr goes to the pipe, but fd 1 is closed. If cmd tries to write any output, the write will fail with an EBADF error, and if it opens a file, that file will get the first free fd, that is fd 1, and so become its stdout, unless the command guards against that! Not what we want either.
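Sketching that failure mode with the same toy command:

```shell
# fd 2 goes to the pipe, then fd 1 is closed
var=$(sh -c 'echo out; echo err >&2' 2>&1 >&-)
# "echo out" fails with EBADF, so "out" never reaches the pipe;
# $var holds the "err" line (plus the shell's write-error message,
# which also goes to stderr and hence to the pipe)
printf '%s\n' "$var"
```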
If we want the stdout of cmd to be left alone, that is, to point to the same resource it pointed to outside the command substitution, then we somehow need to bring that resource inside the command substitution. For that, we can make a copy of stdout outside the command substitution and take it inside:
{
var=$(cmd)
} 3>&1
Which is a cleaner way to write:
exec 3>&1
var=$(cmd)
exec 3>&-
(the braced form also has the benefit of restoring fd 3 at the end instead of closing it).
From the { (or the exec 3>&1) up to the }, both fd 1 and fd 3 point to the resource fd 1 pointed to initially. fd 3 will also point to that resource inside the command substitution (a command substitution only redirects fd 1, stdout). So above, for cmd, we've got for fds 1, 2 and 3:
- fd 1: the pipe to var
- fd 2: untouched
- fd 3: same as what fd 1 points to outside the command substitution
If we change it to:
{
var=$(cmd 2>&1 >&3)
} 3>&1-
Then it becomes:
- fd 1: same as what fd 1 points to outside the command substitution
- fd 2: the pipe to var
- fd 3: same as what fd 1 points to outside the command substitution
Now, we've got what we wanted: stderr goes to the pipe and stdout is left untouched. However, we're leaking that fd 3 to cmd
.
While commands (by convention) assume fds 0 to 2 to be open and be standard input, output and error, they don't assume anything of other fds. Most likely they will leave that fd 3 untouched. If they need another file descriptor, they'll just do an open()/dup()/socket()...
which will return the first available file descriptor. If (like a shell script that does exec 3>&1
) they need to use that fd
specifically, they will first assign it to something (and in that process, the resource held by our fd 3 will be released by that process).
It's good practice to close that fd 3 since cmd doesn't make use of it, but it's no big deal if we leave it assigned when we call cmd. The problems may be that cmd (and potentially other processes that it spawns) has one fewer fd available to it. A potentially more serious problem is if the resource that fd points to ends up held by a process spawned by cmd in the background. That can be a concern if the resource is a pipe or other inter-process communication channel (like when your script is being run as script_output=$(your-script)), as it means the process reading from the other end will never see end-of-file until that background process terminates.
So here, it's better to write:
{
var=$(cmd 2>&1 >&3 3>&-)
} 3>&1
Which, with bash, can be shortened to:
{
var=$(cmd 2>&1 >&3-)
} 3>&1
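Putting it together, here is a runnable sketch of the pattern (using the portable >&3 3>&- form; the inner sh -c command is just for illustration):

```shell
{
  err=$(sh -c 'echo to-stdout; echo to-stderr >&2' 2>&1 >&3 3>&-)
} 3>&1
# "to-stdout" went to the script's normal stdout;
# only the stderr line was captured in $err
printf 'captured: %s\n' "$err"
```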
To sum up the reasons why it's rarely used:
- it's non-standard and just syntactic sugar. You've got to balance saving a few keystrokes with making your script less portable and less obvious to people not used to that uncommon feature.
- The need to close the original fd after duplicating it is often overlooked because most of the time we don't suffer from the consequences, so we just do >&3 instead of >&3- or >&3 3>&-.
Proof that it's rarely used is that, as you found out, it is buggy in bash. In bash, compound-command 3>&4- or any-builtin 3>&4- leaves fd 4 closed even after compound-command or any-builtin has returned. A patch to fix the issue is now (2013-02-19) available.
Best Answer
It doesn't matter, because both 4>&1 and 4<&1 do the same thing: dup2(1, 4), which is the system call to duplicate a fd onto another. The duplicated fd automatically inherits the I/O direction of the original fd. (Same for 4>&- vs 4<&-, which both resolve to close(4), and for 4>&1-, which is the dup2(1, 4) followed by close(1).)
However, the 4<&1 syntax is confusing unless for some reason fd 1 was explicitly open for reading (which would be even more confusing), so in my mind it should be avoided.
The duplicated fd shares the same open file description, which means both fds share the same offset in the file (for those file types where it makes sense) and the same associated flags (I/O redirection/opening mode, O_APPEND and so on).
On Linux, there's another way to duplicate a fd (which is not really a duplication) and create a new open file description for the same resource but with possibly different flags: 3> /dev/fd/4. While on Solaris and probably most other Unices that is more or less equivalent to dup2(4, 3), on Linux it opens the resource pointed to by fd 4 from scratch.
That is an important difference because, for instance, for a regular file the offset of fd 3 will be 0 (the beginning of the file) and the file will be truncated (which is why on Linux you need to write tee -a /dev/stderr instead of tee /dev/stderr). And the I/O mode can be different.
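A minimal sketch of that truncation effect on Linux (assumes /dev/fd is available, as it is via /proc; the temporary file is just for illustration):

```shell
tmp=$(mktemp)
echo "some existing data" >"$tmp"
exec 4>>"$tmp"          # fd 4: append mode, offset at end of the file
exec 3>"/dev/fd/4"      # Linux: reopens the underlying file with O_TRUNC
size=$(wc -c <"$tmp")   # the file has been truncated by the fresh open
exec 3>&- 4>&-
rm -f "$tmp"
echo "$size"
```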
Interestingly, if fd 4 pointed to the reading end of a pipe, then fd 3 now points to the writing end (/dev/fd/3 behaves like a named pipe).