Bash subshell creation with curly braces


According to the Bash documentation, placing a list of commands between curly braces causes the list to be executed in the current shell context. No subshell is created.

Using ps to see this in action

This is the process hierarchy for a process pipeline executed directly on command line. 4398 is the PID for the login shell:

sleep 2 | ps -H;
  PID TTY          TIME CMD
 4398 pts/23   00:00:00 bash
29696 pts/23   00:00:00   sleep
29697 pts/23   00:00:00   ps

Next is the process hierarchy for a pipeline placed between curly braces and executed directly on the command line. 4398 is again the PID of the login shell. The hierarchy matches the one above, which shows that everything is executed in the current shell context:

{ sleep 2 | ps -H; }
  PID TTY          TIME CMD
 4398 pts/23   00:00:00 bash
29588 pts/23   00:00:00   sleep
29589 pts/23   00:00:00   ps

Finally, this is the process hierarchy when the sleep in the pipeline is itself placed inside curly braces (so two levels of braces in all):

{ { sleep 2; } | ps -H; }
  PID TTY          TIME CMD
 4398 pts/23   00:00:00 bash
29869 pts/23   00:00:00   bash
29871 pts/23   00:00:00     sleep
29870 pts/23   00:00:00   ps

Why does bash have to create a subshell to run sleep in the 3rd case when the documentation states that commands between curly braces are executed in current shell context?

Best Answer

In a pipeline, all commands run concurrently (with their stdout/stdin connected by pipes), so they have to run in different processes.

cmd1 | cmd2 | cmd3

All three commands run in different processes, so at least two of them have to run in a child process. Some shells run one of them in the current shell process (if it is a builtin like read, or if the pipeline is the last command of the script), but bash runs each of them in its own separate process (except with the lastpipe option in recent bash versions, and then only under specific conditions).
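A minimal sketch of the lastpipe exception, assuming bash 4.2 or later running a non-interactive script (job control has to be off for lastpipe to take effect). The variable names are made up for the illustration:

```shell
#!/usr/bin/env bash
# Default behaviour: read runs in a child process, so the variable
# it sets never reaches the main shell.
unset first
echo hello | read -r first
default_result=${first:-unset}

# With lastpipe (and job control off, which "set +m" makes explicit),
# the last pipeline element runs in the current shell.
shopt -s lastpipe
set +m
echo hello | read -r second
lastpipe_result=${second:-unset}

echo "default: $default_result, lastpipe: $lastpipe_result"
```

Only the last element of the pipeline is affected; every other element still gets its own process.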

{...} groups commands. If that group is part of a pipeline, it has to run in a separate process just like a simple command.

In:

{ a; b "$?"; } | c

We need a shell to evaluate that a; b "$?" in a separate process, so we need a subshell. The shell could optimise by not forking for b, since it's the last command to be run in that group. Some shells do it, but apparently not bash.
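One way to observe this without ps is to check whether a variable assignment made inside the braces survives. This is only a sketch; the variable names are arbitrary:

```shell
#!/usr/bin/env bash
# A brace group on its own runs in the current shell,
# so the assignment is visible afterwards.
x=0
{ x=1; }
after_plain=$x

# The same group as a pipeline element runs in a forked subshell,
# so the assignment is lost when that subshell exits.
x=0
{ x=1; } | cat
after_pipe=$x

echo "plain group: $after_plain, group in pipeline: $after_pipe"
```

The second assignment happens in the child process created for the left-hand side of the pipe, which is why the main shell still sees the old value.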

Related Solutions

Bash – Rule for Invoking Subshell

The parentheses always start a subshell. What's happening is that bash detects that sleep 5 is the last command executed by that subshell, so it calls exec instead of fork+exec. The sleep command replaces the subshell in the same process.

In other words, the base case is:

( … ) creates a subshell. The original process calls fork and wait. In the subprocess, which is a subshell:
1. sleep is an external command which requires a subprocess of the subprocess. The subshell calls fork and wait. In the subsubprocess:
  1. The subsubprocess executes the external command → exec.
  2. Eventually the command terminates → exit.
2. wait completes in the subshell.
wait completes in the original process.

The optimization is:

( … ) creates a subshell. The original process calls fork and wait. In the subprocess, which is a subshell until it calls exec:
1. sleep is an external command, and it's the last thing this process needs to do.
2. The subprocess executes the external command → exec.
3. Eventually the command terminates → exit.
wait completes in the original process.

When you add something else after the call to sleep, the subshell needs to be kept around, so this optimization can't happen.

When you add something else before the call to sleep, the optimization could be made (and ksh does it), but bash doesn't do it (it's very conservative with this optimization).
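A rough way to see the difference from a script is to let the external command report its own parent PID. This sketch assumes sh is available on the PATH and uses a scratch file from mktemp; the variable names are hypothetical:

```shell
#!/usr/bin/env bash
main=$$
tmp=$(mktemp)

# External command is the last (and only) thing in the subshell.
# If bash applies the exec optimization, sh replaces the subshell
# process, so its parent is the main shell.
( sh -c 'echo "$PPID"' ) > "$tmp"
last_parent=$(cat "$tmp")

# Something follows the external command, so the subshell has to
# stay around: sh is a child of the subshell, not of the main shell.
( sh -c 'echo "$PPID"'; true ) > "$tmp"
kept_parent=$(cat "$tmp")

rm -f "$tmp"
echo "main shell:          $main"
echo "parent (last cmd):   $last_parent"
echo "parent (not last):   $kept_parent"
```

When the optimization applies, the first reported parent should match the main shell's PID, though that depends on the bash version and on details such as whether traps are set. The second is always the PID of the intermediate subshell.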

Bash – Why is subshell created by background control operator (&) not displayed under pstree

Until I started answering this question, I hadn’t realised that using the & control operator to run a job in the background starts a subshell. Subshells are created when commands are wrapped in parentheses or form part of a pipeline (each command in a pipeline is executed in its own subshell).

The Lists of Commands section of the Bash manual (thanks jimmij) states:

If a command is terminated by the control operator ‘&’, the shell executes the command asynchronously in a subshell. This is known as executing the command in the background. The shell does not wait for the command to finish, and the return status is 0 (true).

As I understand it, when you run sleep 10 & the shell forks to create a new child process (a copy of itself) and then immediately execs to replace this child process with code from the external command (sleep). This is similar to what happens when a command is run as normal (in the foreground). See the Fork–exec Wikipedia article for a short overview of this mechanism.

I couldn’t understand why Bash would run backgrounded commands in a subshell, but it makes sense if you also want to be able to run shell builtins such as exit or echo in the background (not just external commands).

When it’s a shell builtin that’s being run in the background, the fork happens (resulting in a subshell) without an exec call to replace itself with an external command. Running the following commands shows that when the echo command is wrapped in curly braces and run in the background (with the &), a subshell is indeed created:

$ { echo $BASH_SUBSHELL $BASHPID; }
0 21516
$ { echo $BASH_SUBSHELL $BASHPID; } &
[1] 22064
$ 1 22064

In the above example, I wrapped the echo command in curly braces to avoid BASH_SUBSHELL being expanded by the current shell; curly braces are used to group commands together without using a subshell. The second version of the command (ending with the & control operator) clearly demonstrates that terminating the command with the ampersand has resulted in a subshell (with a new PID) being created to execute the echo builtin. (I’m probably simplifying the shell’s behaviour here. See mikeserv’s comment.)
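The same check can be scripted. The following is a sketch that sends the output of the background group to a scratch file (created with mktemp, purely for the illustration) so the values can be compared with $! afterwards:

```shell
#!/usr/bin/env bash
out=$(mktemp)

# Run the brace group in the background; the redirection applies
# to the whole group.
{ echo "$BASH_SUBSHELL $BASHPID"; } > "$out" &
bg_pid=$!
wait "$bg_pid"

# Read back what the background group saw.
read -r level pid < "$out"
rm -f "$out"

echo "main shell PID: $$"
echo "background group: BASH_SUBSHELL=$level BASHPID=$pid (\$! was $bg_pid)"
```

The PID reported by the group matches $!, the PID of the background job, and differs from the main shell's $$.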

I would never have thought of running exit & and had I not read your question, I would have expected the current shell to quit. Knowing now that such commands are run in a subshell, your explanation that it’s the subshell which exits makes sense.
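A short sketch of that behaviour: the backgrounded exit terminates only the subshell, and its exit status can be collected with wait. The status value 3 is arbitrary:

```shell
#!/usr/bin/env bash
# exit runs in a background subshell, so only that subshell exits.
exit 3 &
wait "$!"
status=$?

# The main shell is still here and has the subshell's exit status.
echo "main shell still running, background exit status: $status"
```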

“Why is subshell created by background control operator (&) not displayed under pstree”

As mentioned above, when you run sleep 10 &, Bash forks itself to create the subshell but since sleep is an external command, it calls the exec() system call which immediately replaces the Bash code and data in the child process with a running copy of the sleep program. By the time you run pstree, the exec call will already have completed and the child process will now have the name “sleep”.

While away from my computer, I tried to think of a way of keeping the subshell running long enough for it to be displayed by pstree. I figured we could run the command through the time keyword (in Bash, time is a reserved word rather than a builtin):

$ time sleep 11 &
[2] 4502
$ pstree -p 26793
bash(26793)─┬─bash(4502)───sleep(4503)
            └─pstree(4504)

Here, the Bash shell (26793) forks to create a subshell (4502) in order to execute the command in the background. This subshell handles the time keyword itself and, in turn, forks (to create a new process with PID 4503) and execs to run the external sleep command.

Using named pipes, jimmij came up with a clever way to keep the subshell created to run exit alive long enough for it to be displayed by pstree:

$ mkfifo file
$ exit <file &
[2] 6413
$ pstree -p 26793
bash(26793)─┬─bash(6413)
            └─pstree(6414)
$ echo > file
$ jobs
[2]-  Done    exit < file

Redirecting stdin from a named pipe is clever, as it causes the subshell to block until a writer opens the named pipe. Later, redirecting the output of echo (without any arguments) writes a newline character to the named pipe. This unblocks the subshell process which, in turn, runs the exit builtin command.

Similarly, for the sleep command:

$ mkfifo named_pipe
$ sleep 11 < named_pipe &
[1] 6600
$ pstree -p 26793
bash(26793)─┬─bash(6600)
            └─pstree(6603)

Here we see that the subshell created to run the command in the background has a PID of 6600. Next, we unblock the process by writing a newline character to the pipe:

$ echo > named_pipe

The subshell then execs to run the sleep command.

$ pstree -p 26793
bash(26793)─┬─pstree(6607)
            └─sleep(6600)

After the exec() call, we can see that the child process (6600) is now running the sleep program.
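The named-pipe walkthrough above can be scripted roughly as follows. This is a sketch that assumes a POSIX ps supporting -o comm= and -p, and uses a temporary directory and a FIFO named gate that are invented for the example:

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
mkfifo "$dir/gate"

# The forked child blocks while opening the FIFO for reading,
# so it cannot exec sleep yet.
sleep 3 < "$dir/gate" &
child=$!

# Name of the child process while it is still a blocked subshell.
before=$(ps -o comm= -p "$child")

# Opening the FIFO for writing unblocks the child, which then execs sleep.
echo > "$dir/gate"
sleep 1   # give the child a moment to exec
after=$(ps -o comm= -p "$child")

wait "$child"
rm -r "$dir"
echo "PID $child was '$before' before the write and '$after' after it"
```

On a typical Linux system the first name is the shell's and the second is sleep, with the PID unchanged, which is the same transition pstree showed above.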
