This post is basically a follow-up to an earlier question of mine.
From the answer to that question I realized that not only do I not quite understand the whole concept of a "subshell" but, more generally, I don't understand the relationship between `fork`-ing and child processes.
I used to think that when process X executes a `fork`, a new process Y is created whose parent is X, but according to the answer to that question,

"[a] subshell is not a completely new process, but a fork of the existing process."
The implication here is that a "fork" is not (or does not result in) "a completely new process."
I'm now very confused, too confused, in fact, to formulate a coherent question to directly dispel my confusion.
I can however formulate a question that may lead to enlightenment indirectly.
Since, according to `zshall(1)`, `$ZDOTDIR/.zshenv` gets sourced whenever a new instance of `zsh` starts, any command in `$ZDOTDIR/.zshenv` that results in the creation of "a completely new [zsh] process" would result in an infinite regress. On the other hand, including either of the following lines in a `$ZDOTDIR/.zshenv` file does not result in an infinite regress:
echo $(date; printenv; echo $$) > /dev/null #1
(date; printenv; echo $$) #2
The only way I found to induce an infinite regress by the mechanism described above was to include a line like the following [1] in the `$ZDOTDIR/.zshenv` file:
$SHELL -c 'date; printenv; echo $$' #3
My questions are:
- What difference between the commands marked `#1` and `#2` above and the one marked `#3` accounts for this difference in behavior?
- If the shells that get created in `#1` and `#2` are called "subshells", what are those like the one generated by `#3` called?
- Is it possible to rationalize (and maybe generalize) the empirical/anecdotal findings described above in terms of the "theory" (for lack of a better word) of Unix processes?
The motivation for the last question is to be able to determine ahead of time (i.e. without resorting to experimentation) which commands would lead to an infinite regress if they were included in `$ZDOTDIR/.zshenv`.
[1] The particular sequence of commands `date; printenv; echo $$` that I used in the various examples above is not too important. They happen to be commands whose output was potentially helpful towards interpreting the results of my "experiments". (I did, however, want these sequences to consist of more than one command, for the reason explained here.)
Best Answer
If you focus on the word "starts" here you'll have a better time of things. The effect of `fork()` is to create another process that begins from exactly where the current process already is. It's cloning an existing process, with the only difference being the return value of `fork`. The documentation is using "starts" to mean entering the program from the beginning.

Your example #3 runs `$SHELL -c 'date; printenv; echo $$'`, starting an entirely new process from the beginning. It will go through the ordinary startup behaviour. You can illustrate that by, for example, swapping in another shell: run `bash -c ' ... '` instead of `zsh -c ' ... '`. There's nothing special about using `$SHELL` here.

Examples #1 and #2 run subshells. The shell `fork`s itself and executes your commands inside that child process, then carries on with its own execution when the child is done.

The answer to your question #1 is the above: example #3 runs an entirely new shell from the start, while the other two run subshells. The startup behaviour includes loading `.zshenv`.

The reason they call this behaviour out specifically, which is probably what leads to your confusion, is that this file (unlike some others) loads in both interactive and non-interactive shells.
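The difference between a subshell and a newly started shell can be seen without touching any startup files. The following is a minimal sketch, assuming a POSIX-style `sh` is on the `PATH`; the variable name `fork_demo_marker` is made up for illustration. A forked subshell carries on with the parent's in-memory state, while a shell started from the beginning only receives what was exported to it.

```shell
# A variable that is NOT exported lives only in this shell's memory.
fork_demo_marker="set in parent"

# Subshell (like examples #1 and #2): a fork of this shell,
# so it still sees the unexported variable.
seen_by_subshell=$( (echo "${fork_demo_marker:-unset}") )

# New shell (like example #3): a fresh program run from the beginning.
# It only inherits exported environment variables, so the marker is gone.
seen_by_new_shell=$( sh -c 'echo "${fork_demo_marker:-unset}"' )

echo "subshell:  $seen_by_subshell"
echo "new shell: $seen_by_new_shell"
```

The subshell reports `set in parent` and the new shell reports `unset`. In the same way, a subshell has no startup phase in which `.zshenv` could be read again, whereas a new `zsh` does.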
To your question #2:

If you want a name you could call it a "child shell", but really it's nothing special. It's no different than any other process you start from the shell, be it the same shell, a different shell, or `cat`.

To your question #3:
`fork` makes a new process, with a new PID, that starts running in parallel from exactly where this one left off. `exec` replaces the currently-executing code with a new program loaded from somewhere, running from the beginning. When you spawn a new program, you first `fork` yourself and then `exec` that program in the child. That is the fundamental theory of processes that applies everywhere, inside and outside of shells.

Subshells are `fork`s, and every non-builtin command you run leads to both a `fork` and an `exec`.

Note that `$$` expands to the PID of the parent shell in any POSIX-compatible shell, so you may not be getting the output you expect regardless. Note also that zsh aggressively optimises subshell execution anyway, and commonly `exec`s the last command, or doesn't spawn the subshell at all if all the commands are safe without it.

One useful command for testing your intuitions is:
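A sketch of such a command, assuming a Linux system with `strace` installed. The tool and its flags are an assumption here, as is the traced command, which borrows from the question's example:

```shell
# Assumption: Linux with strace available; neither is named in the text.
# -f follows child processes; -e trace=process restricts the trace to
# process-management system calls (fork/clone, execve, exit, wait).
traced_command='date; echo $$'   # stand-in for the command you want to study

if command -v strace >/dev/null 2>&1; then
    strace -f -e trace=process "${SHELL:-/bin/sh}" -c "$traced_command"
else
    echo "strace not found; use your platform's equivalent tracer" >&2
fi
```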
That will print to standard error all process-related events (and no others) for the command `...` you run in a new shell. You can see what does and does not run in a new process, and where `exec`s occur.

Another possibly-useful command is `pstree -h`, which will print out and highlight the tree of parent processes of the current process. You can see how many layers deep you are in the output.
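The earlier points about `$$` and about `exec` keeping the same process can also be checked with nothing but the shell. This is a rough sketch, assuming a shell that really forks for subshells (bash, dash and zsh do for command substitution; ksh93 may not); the variable names are illustrative only.

```shell
# $$ inside a subshell still reports the PID of the original shell.
parent_pid=$$
dollar_in_subshell=$( echo $$ )

# To learn the subshell's real PID, ask a child process for its parent's PID.
# The trailing ":" stops the subshell from simply exec-ing its last command,
# which would hide the extra process.
real_subshell_pid=$( sh -c 'echo $PPID'; : )

# exec replaces the running program but keeps the process,
# so both lines printed here carry the same PID.
exec_pids=$( sh -c 'echo $$; exec sh -c "echo \$\$"' )

echo "parent shell PID:        $parent_pid"
echo "\$\$ inside the subshell:  $dollar_in_subshell"
echo "actual subshell PID:     $real_subshell_pid"
echo "PID before and after exec:"
echo "$exec_pids"
```

The first two values match, the third differs from them, and the last two lines are identical to each other: `fork` gives a new PID, `exec` does not.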