Shell – Do Redirection Operators Always Open File Descriptors in Parallel?

io-redirection, shell

1. Consider snippet#1:

$ cat test.txt > test.txt
cat: test.txt: input file is output file

It seems that cat points its input file descriptor at test.txt, and then throws the above error when it tries to set its output file descriptor to test.txt as well. So it seems that cat is aware of the redirection operator and goes on to set its own output file descriptor to test.txt.

2. Consider snippet#2:

$ cat test.txt
1:cat
2:dog
$ sed 's/cat/CAT/g' test.txt
1:CAT
2:dog
$ sed 's/cat/CAT/g' test.txt > test.txt
$ cat test.txt # Note that test.txt is now empty
$

Here we see that sed opens test.txt (its last argument) for reading and at the same time sets test.txt as its output file descriptor. Also, the '>' operator truncates the file BEFORE sed starts to read from it.

I am aware that commands in a pipeline execute in parallel but have not come across any info on how redirection operators behave. Any supporting links would be helpful.

Best Answer

In addition to the documentation jordanm points to, I want to make sure to correct a misconception illustrated in your question—the executed program does not handle redirects. It is barely even aware of them. The shell handles redirects.
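One way to see that the shell, not the program, performs the redirection is to redirect the output of a program that never writes anything. This is a minimal sketch; the scratch file and the choice of `sleep 0` are illustrative assumptions, not part of the question.

```shell
# Scratch file standing in for test.txt (the path is arbitrary).
f=$(mktemp)
printf '1:cat\n2:dog\n' > "$f"

# sleep never writes to stdout, yet the file ends up empty:
# the shell opened "$f" with truncation before sleep was even started.
sleep 0 > "$f"

if [ -s "$f" ]; then result="still has data"; else result="empty"; fi
echo "after 'sleep 0 > file': $result"
rm -f "$f"
```

The same thing happens with sed 's/cat/CAT/g' test.txt > test.txt: by the time sed opens test.txt for reading, the shell has already truncated it.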

A program is started with three files open: stdin (#0), stdout (#1), and stderr (#2). If you just run a program from your shell prompt, these will be connected to your terminal device, so the program reads what you type (stdin), and prints output (stdout) and errors (stderr) to your terminal.

As an example, I just ran cat in a terminal (which tty says is /dev/pts/31). I can check which files it has open with lsof:

$ lsof -a -p `pidof cat` -d0,1,2
COMMAND   PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
cat     21257 anthony    0u   CHR 136,31      0t0   34 /dev/pts/31
cat     21257 anthony    1u   CHR 136,31      0t0   34 /dev/pts/31
cat     21257 anthony    2u   CHR 136,31      0t0   34 /dev/pts/31

Indeed, we can see that it has the terminal open for all three. Now, instead, let's try a rather silly cat invocation: cat < /dev/zero > /dev/null 2>/dev/full, which is redirecting all three:

COMMAND   PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
cat     21838 anthony    0r   CHR    1,5      0t0 1030 /dev/zero
cat     21838 anthony    1w   CHR    1,3      0t0 1028 /dev/null
cat     21838 anthony    2w   CHR    1,7      0t0 1031 /dev/full
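On Linux, a similar check can be sketched without lsof by having a process report its own descriptors through /proc/self/fd. This assumes a Linux /proc filesystem and the readlink utility, neither of which appears in the answer itself.

```shell
# Each readlink resolves its OWN descriptor, which the shell set up
# before starting it. Guarded, because /proc/self/fd is Linux-specific.
if [ -e /proc/self/fd/0 ]; then
    stdin_target=$(readlink /proc/self/fd/0 < /dev/zero)
    stderr_target=$(readlink /proc/self/fd/2 2> /dev/null)
    echo "stdin  -> $stdin_target"
    echo "stderr -> $stderr_target"
else
    echo "no /proc/self/fd on this system; lsof is the portable route"
fi
```

readlink contains no redirection logic of its own; it just finds descriptors 0 and 2 already pointing at the devices named on the command line.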

The shell implemented those redirections by passing the three devices as stdin, stdout, and stderr (instead of the terminal). The shell similarly implements pipes. Let's try cat | dd > /dev/null (a rather silly pipe, indeed):

COMMAND   PID    USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
cat     22507 anthony    0u   CHR 136,31      0t0       34 /dev/pts/31
cat     22507 anthony    1w  FIFO    0,8      0t0 56081395 pipe
cat     22507 anthony    2u   CHR 136,31      0t0       34 /dev/pts/31

COMMAND   PID    USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
dd      22508 anthony    0r  FIFO    0,8      0t0 56081395 pipe
dd      22508 anthony    1w   CHR    1,3      0t0     1028 /dev/null
dd      22508 anthony    2u   CHR 136,31      0t0       34 /dev/pts/31

Notice how the shell has opened a pipe, and it has used it to connect the stdout of cat to the stdin of dd. And further how it has connected dd's stdout to /dev/null.
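The two ends of a pipe can be observed from inside the pipeline as well. The sketch below again assumes Linux, where /proc/self/fd exposes a pipe as a link of the form pipe:[inode]; the variable names are made up for illustration.

```shell
if [ -e /proc/self/fd/0 ]; then
    # Left side of a pipeline: stdout is the write end of a pipe.
    writer_stdout=$(readlink /proc/self/fd/1 | cat)
    # Right side of a pipeline: stdin is the read end of a pipe.
    reader_stdin=$(true | readlink /proc/self/fd/0)
    echo "writer stdout -> $writer_stdout"
    echo "reader stdin  -> $reader_stdin"
fi
```

Both values should start with pipe:, although the inode numbers differ because these are two separate pipelines.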

The commands being run aren't really aware of the redirections. They just use stdin, stdout, stderr as normal. Those could all be the terminal, or they could be redirected to/from a file, a device, or maybe a pipe to another program. Or even a network socket, if your shell supports that.

Even the most ridiculously complicated pipelines are actually just instructions to the shell on how to connect those three file handles before executing the program.
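The order of operations can be imitated by hand in a subshell: first rearrange the descriptors, then replace the process with the program. This is a rough sketch of the idea, not a description of how any particular shell is implemented; the scratch file and the message are made up for illustration.

```shell
f=$(mktemp)
(
    # Step 1: open the target and install it as fd 1
    # (this is the moment an existing file would be truncated).
    exec > "$f"
    # Step 2: replace this process with the program; it simply inherits fd 1.
    exec echo "written by the program"
)
content=$(cat "$f")
echo "file now contains: $content"
rm -f "$f"
```

The echo program is given no file name at all. It writes to descriptor 1 as usual and the text lands in the file.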

(NOTE: Some programs behave differently when one of those is attached to a terminal, but that's normally to be more user-friendly in interactive use. For example, ls switches to single-column output and no color when stdout isn't a terminal, which is usually what you want if you're about to pass it to another program. Some programs handle prompting differently if stdin isn't a terminal. And so on.)
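A script can make the same kind of decision with the standard test -t, which reports whether a descriptor refers to a terminal. A minimal sketch, with a hypothetical function name:

```shell
# Report what kind of stdout the caller gave us;
# [ -t 1 ] is true only when fd 1 is a terminal.
describe_stdout() {
    if [ -t 1 ]; then
        echo "terminal"
    else
        echo "not a terminal"
    fi
}

# Inside $(...) stdout is a pipe, so the function takes the
# non-interactive branch regardless of where the script was started.
captured=$(describe_stdout)
echo "when captured: $captured"
```

Called directly at an interactive prompt, describe_stdout would print "terminal" instead.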