Bash – How does bash actually change stdin/stdout/stderr when using redirection/piping


Unfortunately I've had no luck figuring this out, as everything I find is just on the syntax of redirection, or shallow information about how redirection works.

What I want to know is how bash actually changes stdin/stdout/stderr when you use pipes or redirection. If, for example, you execute:

ls -la > diroutput.log

How does it change stdout of ls to diroutput.log?

I assume it works like this:

1. Bash runs fork(2) to create a copy of itself.
2. The forked bash process sets its stdout to diroutput.log using something like freopen(3).
3. The forked bash process runs execve(2) or a similar exec function to replace itself with ls, which now uses the stdout set up by bash.

But that's just my educated guess.

Best Answer

I was able to figure it out using strace -f and writing a small proof of concept in C.

It appears that bash simply manipulates file descriptors in the child process before calling execve, as I thought.

Here's how ls -la > diroutput.log works (roughly):

1. bash calls fork(2).
2. The forked bash process sees the output redirection and opens the file diroutput.log using open(2).
3. The forked bash process makes the stdout file descriptor (1) refer to that file using the dup2(2) syscall, then closes the now-redundant original descriptor with close(2).
4. The forked bash process calls execve(2) to replace its executable image with ls, which then inherits the already set-up stdout.

The relevant syscalls look like this (strace output):

6924  open("diroutput.log", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3 
6924  dup2(3, 1)                        = 1 
6924  close(3)                          = 0 
6924  execve("/bin/ls", ["ls", "-la"], [/* 77 vars */]) = 0
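
The same sequence can be approximated from the shell itself. This is only a rough sketch, assuming a POSIX-style shell: the function name run_redirected and the choice of descriptor 3 are illustrative, and the shell's own redirections stand in for the raw syscalls rather than reproducing them exactly.

```shell
# Hypothetical helper: run a command with stdout sent to a file,
# spelling out each step in the same order as the strace output.
run_redirected() {
    outfile=$1
    shift
    (                       # subshell: stands in for fork(2)
        exec 3>"$outfile"   # roughly open(2) with O_WRONLY|O_CREAT|O_TRUNC
        exec 1>&3           # roughly dup2(3, 1): fd 1 now refers to the file
        exec 3>&-           # roughly close(3): drop the extra descriptor
        exec "$@"           # roughly execve(2): the command inherits fd 1
    )
}
```

Calling run_redirected diroutput.log ls -la should leave the listing in diroutput.log, much as ls -la > diroutput.log does. The command itself never opens the file; it just writes to descriptor 1, which was already pointing there before it started.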

Related Solutions

What sets a child’s STDERR, STDOUT, and STDIN

Stdin, stdout and stderr are inherited from the parent process. It's up to the child process to change them to point to new files if that is needed.

From the fork(2) man page:

   *  The child inherits copies of the parent's set of open file descriptors.
      Each file descriptor in the child refers to the same open file
      description (see open(2)) as the corresponding file descriptor in the
      parent.
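
A small shell experiment can make the "same open file description" part visible. This is a sketch, assuming a POSIX-style shell; the function name shared_offset_demo and the use of descriptor 3 are made up for illustration.

```shell
# Hypothetical demo: parent and child write through the same inherited
# descriptor. The file to write to is supplied by the caller.
shared_offset_demo() {
    logfile=$1
    exec 3>"$logfile"            # parent opens the file on fd 3
    ( echo "from child" >&3 )    # forked subshell inherits fd 3 and writes
    echo "from parent" >&3       # parent writes after the child has finished
    exec 3>&-                    # parent closes its descriptor
    cat "$logfile"
}
```

Because the child's descriptor refers to the same open file description, the file offset is shared: the parent's line lands after the child's line instead of overwriting it from offset 0, so the file should contain "from child" followed by "from parent".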

Bash – Prevent a shell fork from living longer than its initiator

This kills the background process before the script exits:

trap '[ "$pid" ] && kill "$pid"' EXIT

function repeat {
    while :; do
        echo repeating; sleep 1
    done
}
repeat &
pid=$!
echo running once

How it works

trap '[ "$pid" ] && kill "$pid"' EXIT

This creates a trap. Whenever the script is about to exit, the command in single quotes is run. It checks whether the shell variable pid has been assigned a non-empty value. If it has, the process associated with pid is killed.

pid=$!

This saves the process id of the preceding background command (repeat &) in the shell variable pid.
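
To see $! and kill working together outside of a trap, here is a minimal sketch. The function name kill_background_demo and the sleep 30 stand-in for the background job are assumptions made for illustration; they are not part of the script above.

```shell
# Hypothetical demo of $! and kill, without any trap involved.
kill_background_demo() {
    sleep 30 &          # background job, standing in for "repeat &"
    pid=$!              # $! holds the pid of the job just started
    kill "$pid"         # what the trap does at exit: send SIGTERM
    wait "$pid"         # collect the job's exit status
    echo "status=$?"    # death by signal is reported as 128 + signal number
}
```

On Linux, where SIGTERM is signal 15, calling kill_background_demo should print status=143, which confirms that the pid saved from $! really was the background job and that it was terminated by the kill.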

Improvement

As Patrick points out in the comments, there is a chance that the script could be killed after the background process starts but before the pid variable is set. We can handle that case with this code:

my_exit() {
    [ "$racing" ] && pid=$!
    [ "$pid" ] && kill "$pid"
}
trap my_exit EXIT

function repeat {
    while :; do
        echo repeating; sleep 1
    done
}

racing=Y
repeat &
pid=$!
racing=

echo running once