Bash – How to understand the output of tracing running an external command in bash

bashexecforklinuxstrace

In Ubuntu, I run date directly in an interactive bash shell whose pid is 6913.

$ date
Wed Mar  2 23:57:44 EST 2016

At the same time, I trace the bash shell 6913 from another interactive bash shell using strace:

    $ sudo strace -f -e trace=process -p 6913
    Process 6913 attached
    clone(Process 9098 attached
    child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
    child_tidptr=0x7f457c05ca10) = 9098
    [pid  6913] wait4(-1,  <unfinished ...>
    [pid  9098] execve("/bin/date", ["date"], [/* 66 vars */]) = 0
    [pid  9098] arch_prctl(ARCH_SET_FS, 0x7f40d6a4f740) = 0
    [pid  9098] exit_group(0)               = ?
    [pid  9098] +++ exited with 0 +++
    <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED|WCONTINUED, NULL) = 9098
    --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=9098, si_status=0, si_utime=0, si_stime=0} ---
    wait4(-1, 0x7ffea6781518, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)
    clone(Process 9099 attached
    child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f457c05ca10) = 9099
    [pid  9099] clone(Process 9100 attached
    child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f457c05ca10) = 9100
    [pid  9099] wait4(-1,  <unfinished ...>
    [pid  9100] execve("/bin/sed", ["sed", "s:\\([^/]\\)[^/]*/:\\1/:g"], [/* 66 vars */]) = 0
    [pid  9100] arch_prctl(ARCH_SET_FS, 0x7f998bb03840) = 0
    [pid  9100] exit_group(0)               = ?
    [pid  9100] +++ exited with 0 +++
    [pid  9099] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 9100
    [pid  9099] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=9100, si_status=0, si_utime=0, si_stime=0} ---
    [pid  9099] wait4(-1, 0x7ffea6780c58, WNOHANG, NULL) = -1 ECHILD (No child processes)
    [pid  9099] exit_group(0)               = ?
    [pid  9099] +++ exited with 0 +++
    --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=9099, si_status=0, si_utime=0, si_stime=0} ---
    wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 9099
    wait4(-1, 0x7ffea6780f18, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)

It seems to me the output can be divided into two parts:

  1. First the bash shell 6913 clone() itself to create a subprocess
    9098, and then the subprocess 9098 execve() date and exits.

  2. After these, the bash shell 6913 clone() itself to create a
    subprocess 9099, then the subprocess 9099 clone() itself to create
    a subsubprocess 9100, and then the subsubprocess 9100 execve() sed. My question is about this second part:

    • Do the last two clone() and the last execve() belong to the action that handles SIGCHLD received by the bash shell 6913?
      What does the action do?

    • Why does process 9100 execve() sed? What does sed do here?

    • Why isn't process 9099 the process that execve() sed? Why does 9099 clone() to create 9100, and then 9100 execve() sed? In other words, why do we need two sequential clones 9099 and 9100, rather than just one clone 9099?


To answer the comment:

$ echo $PROMPT_COMMAND
pwd2=$(sed "s:\([^/]\)[^/]*/:\1/:g" <<<$PWD)
$ echo $PS1
\u@\h:$pwd2\$

After running unset PROMPT_COMMAND in shel 6913, the tracing output is

$ sudo strace -f -e trace=process -p 6913
[sudo] password for t: 
Process 6913 attached
clone(Process 12918 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f457c05ca10) = 12918
[pid  6913] wait4(-1,  <unfinished ...>
[pid 12918] execve("/bin/date", ["date"], [/* 66 vars */]) = 0
[pid 12918] arch_prctl(ARCH_SET_FS, 0x7ff00c632740) = 0
[pid 12918] exit_group(0)               = ?
[pid 12918] +++ exited with 0 +++
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED|WCONTINUED, NULL) = 12918
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12918, si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, 0x7ffea6781518, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)

So the first two questions above are now answered. I am still not sure about the third question.

Best Answer

clone(Process 9099 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f457c05ca10) = 9099
[pid  9099] clone(Process 9100 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f457c05ca10) = 9100
[pid  9099] wait4(-1,  <unfinished ...>
[pid  9100] execve("/bin/sed", ["sed", "s:\\([^/]\\)[^/]*/:\\1/:g"], [/* 66 vars */]) = 0

The two forks (on modern Linux systems, forks are done by using the clone system call) are due to bash evaluating your PROMPT_COMMAND variable, which you said is

pwd2=$(sed "s:\([^/]\)[^/]*/:\1/:g" <<<$PWD)

The forks are not directly related to the SIGCHLD signal previously received.

The bash manual says:

The value of the variable PROMPT_COMMAND is examined just before Bash prints each primary prompt. If PROMPT_COMMAND is set and has a non-null value, then the value is executed just as if it had been typed on the command line.

Internally, bash ultimately calls parse_and_execute to evaluate the contents of PROMPT_COMMAND. Bash tries to minimize the number of forks it needs to do. For simple statements such as pwd2=$PWD, no forks are needed. For more complex statements, one or more forks may be done. In your case, $( ... ) results in a fork, pid 9099, of the shell, which will then evaluate the command between parentheses. The call to the non-builtin utility sed results in another fork, pid 9100, followed by an execve of /bin/sed.

What does sed do here?

It appears to chop off all but the first character of the name of each directory above the current working directory.

Related Question