Shell Scripting – Read/Write to Same File Descriptor with Redirection

io-redirection shell shell-script

I'm trying to understand file descriptors in the context of shell redirection.

Why can't I have cat read from FD 3, which is being written to by ls's STDOUT?

{ err=$(exec 2>&1 >&3; ls -ld /x /bin); exec 0<&3; out=$(cat); } 3>&1;

When I try this, cat still wants to read from my keyboard.

If this can't be done, why not?

Differentiation: This question is about reading from and writing to the same file descriptor. It uses the problem presented by "Redirect STDERR and STDOUT to different variables without temporary files" as an example.

Best Answer

{ err=$(exec 2>&1 >&3; ls -ld /x /bin); exec 0<&3; out=$(cat); } 3>&1

The { ... } 3>&1 clones fd 1 to fd 3. That just means that fd 3 now points to the same resource (the same open file description) as fd 1 does. If you run that from a terminal, it will probably be an fd open in read+write mode on a terminal device.
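A minimal sketch of what sharing an open file description means, assuming a POSIX shell and a scratch file from mktemp (the variable names are only illustrative): both fds advance the same file offset, so the second write lands after the first instead of overwriting it.

```shell
tmp=$(mktemp)
{
  printf 'first\n' >&3   # written through fd 3
  printf 'second\n'      # written through fd 1: continues at the shared offset
} >"$tmp" 3>&1           # fd 1 opens the file, then fd 3 is cloned from fd 1
content=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$content"
```

Had fd 3 been opened separately with 3>"$tmp", it would have its own offset and the two writes would overlap.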

After exec 0<&3, fds 0, 1, and 3 all point to that same open file description. In the terminal case above, that is the one created when your terminal emulator opened the slave side of the pseudo-terminal pair it set up before executing your shell.

Then in out=$(cat), the $(...) changes fd 1 of the process executing cat to the writing end of a pipe, while fd 0 is still the tty device. So cat reads from the terminal device, that is, whatever you type on the keyboard. (If it weren't a terminal device, you would probably get an error, as the fd was probably open in write-only mode.)
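This can be checked from inside a command substitution. The sketch below assumes only a POSIX shell and uses the standard test -t to ask whether an fd is a terminal; the stdin line goes to stderr because its answer depends on how the script is started.

```shell
# Inside $(...), fd 1 is the writing end of a pipe, never the terminal;
# fd 0 is left exactly as it was inherited.
report=$(
  if [ -t 1 ]; then echo "stdout is a tty"; else echo "stdout is not a tty"; fi
  if [ -t 0 ]; then echo "stdin is a tty" >&2; else echo "stdin is not a tty" >&2; fi
)
printf '%s\n' "$report"
```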

For cat to read what ls writes on its stdout, you'd need ls's stdout and cat's stdin to be the two ends of an IPC mechanism like a pipe, socketpair or pseudo-terminal pair. For instance, ls's stdout would be the writing end of a pipe and cat's stdin the reading end.

But you'd also need ls and cat to run concurrently, not one after the other, as that's an IPC (inter-process communication) mechanism.
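As a rough illustration of the concurrency requirement, here is a sketch that uses a named pipe as the IPC channel (an assumption made for the example; the paths are hypothetical scratch names). The writer is put in the background so that it and the reader run at the same time.

```shell
dir=$(mktemp -d)
mkfifo "$dir/pipe"
# Writer in the background: its open() blocks until a reader opens the FIFO.
ls -d / >"$dir/pipe" &
# Reader in the foreground, running at the same time as the writer.
IFS= read -r line <"$dir/pipe"
wait
rm -rf "$dir"
printf '%s\n' "$line"
```

Without the &, the first command would wait forever for a reader that is never started.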

Since pipes can hold some data (64 KiB by default on current versions of Linux), you would get away with short outputs if you managed to create that second pipe. For larger outputs, though, you'd run into a deadlock: ls would block once the pipe is full, waiting for something to empty it, but cat would only start emptying the pipe after ls returns.

Also, only yash has a raw interface to pipe() which you'd need to create that second pipe to read from ls stdout (the other pipe for stderr being created by the $(...) construct).

In yash, you'd do:

{ out=$(ls -d / /x 2>&3); exec 3>&-; err=$(exec cat <&4); } 3>>|4

Where 3>>|4 (a yash-specific feature) creates that second pipe with the writing end on fd 3 and the reading end on fd 4.

But again, if the stderr output is greater than the size of the pipe, that will hang. We're effectively using the pipe as a temporary file in memory, not a pipe.
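In shells without that redirection operator, something similar can be approximated with a named pipe. The sketch below is not from the answer; it assumes Linux, where opening a FIFO in read+write mode does not block (POSIX leaves that unspecified), and it uses a stand-in sh -c command with fixed output in place of ls. The same size limit applies: stderr output larger than the pipe buffer would hang.

```shell
dir=$(mktemp -d)
mkfifo -m 600 "$dir/pipe"
exec 5<>"$dir/pipe"                # read+write open: does not block on Linux
exec 3>"$dir/pipe" 4<"$dir/pipe"   # fd 3: writing end, fd 4: reading end
exec 5<&-                          # drop the helper fd
rm -rf "$dir"                      # the open fds keep the pipe alive

{
  # Stand-in command: stdout is captured, stderr is parked in the pipe.
  out=$(sh -c 'echo to-stdout; echo to-stderr >&2' 2>&3)
  exec 3>&-                        # close the writing end so cat sees EOF
  err=$(cat <&4)                   # drain the pipe afterwards
}
exec 4<&-
printf 'out=%s err=%s\n' "$out" "$err"
```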

To really use pipes, we'd need to start ls with stdout being the writing end of one pipe and stderr being the writing end of another, and then have the shell read the other ends of those pipes concurrently, as the data comes (not one after the other, or again you'd run into deadlocks), to store it into the two variables.

To be able to read from those two fds as the data comes, you'd need a shell with select()/poll() support. zsh is such a shell, but it doesn't have yash's pipeline redirection feature¹, so you'd need to use named pipes (so manage their creation, permissions, and cleanup) and use a complex loop with zselect/sysread...

¹ On Linux, though, you can use the fact that /proc/self/fd/x on a pipe behaves like a named pipe, so you could do:

#! /bin/zsh
zmodload zsh/zselect
zmodload zsh/system

(){exec {wo}>$1 {ro}<$1} <(:) # like yash's wo>>|ro (but on Linux only)
(){exec {we}>$1 {re}<$1} <(:)

ls -d / /x >&$wo 2>&$we &
exec {wo}>&- {we}>&-
out= err=
o_done=0 e_done=0

while ((! (o_done && e_done))) && zselect -A ready $ro $re; do
  if ((${#ready[$ro]})); then
    sysread -i $ro && out+=$REPLY || o_done=1
  fi
  if ((${#ready[$re]})); then
    sysread -i $re && err+=$REPLY || e_done=1
  fi
done

Related Solutions

Bash IO Redirection – How Can a File Redirected for Input Be Written To and Can It Be Prevented?

That's due to the way /dev/stdin (actually /proc/self/fd/0) is implemented on Linux (and Cygwin, but generally not other systems).

On Linux, opening /dev/stdin is not like doing a dup(0); it just opens anew the same file as is open on fd 0. It doesn't share the open file description that fd 0 refers to (with its read-only mode), but gets a completely unrelated new open file description, with the mode specified in open().
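The independent open file description can be observed through the file offset. This is a small sketch, assuming Linux and a scratch file from mktemp; the variable names are illustrative.

```shell
tmp=$(mktemp)
printf 'abc\ndef\n' >"$tmp"
{
  IFS= read -r first                 # reads "abc" via fd 0; offset is now past line 1
  IFS= read -r reopened </dev/stdin  # Linux: fresh open of the file, offset 0 again
  IFS= read -r next                  # fd 0 itself carries on where it left off
} <"$tmp"
rm -f "$tmp"
printf '%s %s %s\n' "$first" "$reopened" "$next"
```

On Linux this should print abc abc def. On a system where opening /dev/stdin acts like dup(0), the second read would share fd 0's offset and get def instead.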

So if sops -d /dev/stdin opens /dev/stdin in read+write mode and fd 0 was open in read-only on /some/file, /some/file will be open in read+write.

Effectively, cmd /dev/stdin < file there is the same as cmd file < file. You'll find that /dev/stdin is just a symlink¹ to file:

/tmp$ namei -l /dev/stdin < file
f: /dev/stdin
drwxr-xr-x root     root     /
drwxr-xr-x root     root     dev
lrwxrwxrwx root     root     stdin -> /proc/self/fd/0
drwxr-xr-x root     root       /
dr-xr-xr-x root     root       proc
lrwxrwxrwx root     root       self -> 73569
dr-xr-xr-x stephane stephane     73569
dr-x------ stephane stephane   fd
lr-x------ stephane stephane   0 -> /tmp/file
drwxr-xr-x root     root         /
drwxrwxrwt root     root         tmp
-rw-r--r-- stephane stephane     file

It can get worse. If it was opened with O_TRUNC, the file would be truncated. If fd 0 was pointing to the reading end of a pipe and /dev/stdin was opened in write-only mode, you'd get the other end of the pipe.

But using:

cat file | cmd /dev/stdin

would guard against cmd overwriting file, as all cmd would see is the pipe. Even if it did open /dev/stdin in write-only mode, it couldn't get back to the file; it would just get the writing end of the pipe, and the only file descriptor on the reading end would be cmd's stdin.
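Both cases can be tried safely on a scratch file. In this sketch (Linux assumed), sh -c ': >/dev/stdin' is a hypothetical stand-in for a command that opens /dev/stdin for writing with truncation.

```shell
tmp=$(mktemp)
printf 'precious\n' >"$tmp"

# Through a pipe: opening /dev/stdin for writing only reaches the pipe.
cat "$tmp" | sh -c ': >/dev/stdin'
after_pipe=$(cat "$tmp")

# Redirected directly: on Linux the same open truncates the file itself.
sh -c ': >/dev/stdin' <"$tmp"
after_direct=$(wc -c <"$tmp")

rm -f "$tmp"
printf 'after pipe: %s, size after direct redirect: %s\n' "$after_pipe" "$after_direct"
```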

Other OSes don't have the problem as opening /dev/stdin there is like doing a dup(0), so you get the same open file description and if you open with an incompatible mode, the open() system call just fails.

¹ Technically, as noted by @user414777 in comments, /proc/<pid>/fd/<fd> are magic symlinks in that, for instance, they can reach into places that normal symlinks could not. But when it comes to opening them, past the path resolution stage, they act like normal symlinks in that you just open the target file.
