Bash Scripting – $! Not Set to PID in Process Substitution

bash, scripting, shell-script

Bash 4.4.19(1)-release

Below is a simple script that forms the basis of a logging app.
For various reasons I have to use process substitution.

The runner is the heart of the app. Since process substitution is asynchronous, I use the while loop to wait for it to finish, which keeps the output reasonably coherent. It works perfectly.

Unfortunately I found a case where it does not work: when the runner executes 'bash <filename> <function>'.

Two files are needed to reproduce it.

Questions:

  1. Why does this happen?
  2. How can I modify my while loop to accommodate similar cases?

Simplified script is:

test.sh

#!/bin/bash

2sub() {
    local in=$(cat)
    echo -e "$in"
}
runner() {
    "${@}" 1> >(2sub)
    while [ -e /proc/$! ]; do sleep 0.1; done     # <<< LOOP WAIT FOR $!
}
remotesub() {
    bash ./test2.sh remotesub2
}

echo -e "running\n"
runner bash ./test2.sh remotesub2 # LOOPS
# runner remotesub # A POSSIBLE BYPASS/SOLUTION? But why?
echo -e "done!\n"

test2.sh

     remotesub2() {
         echo -e "'${BASH_VERSION}'"
         return 0
     }

     "$@"

Bypass:

As you can see from the script, there is a bypass for the problem: wrapping bash <filename> <function> in a function and passing that function to the runner. I am sure somebody here knows why this works and the direct way does not.

Please shed some light on this issue, and let me know if there are better ways to write the waiting loop so that it covers these cases.

Solution:

The best solution is what mosvy suggested. Thank you.
Using { "${@}"; } removes the need to wrap the commands in separate small functions, which is a pain. Also, after many hours of testing with my larger code, I concluded that careful killing of sub-processes makes the while [ -e /proc/$! ]; do sleep 0.1; done loop unnecessary. That line was replaced with wait $!;
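A minimal sketch of the revised runner described above, assuming bash 4.4 or later (where wait accepts the PID of a process substitution). The name sink is a hypothetical stand-in for 2sub, and bash -c 'echo hello' stands in for the external command; this is an illustration, not the full logging code.

```bash
#!/bin/bash
# Hypothetical stand-in for the question's 2sub: read all of stdin, then print it.
sink() {
    local in
    in=$(cat)
    printf 'logged: %s\n' "$in"
}

runner() {
    # The group command is run by the current shell, so the process
    # substitution becomes its direct child and $! is set to its PID.
    { "$@"; } > >(sink)
    # Assumes bash 4.4+, where wait accepts the PID of a process substitution.
    wait "$!"
}

runner bash -c 'echo hello'   # external command, the case that used to loop
echo "done"
```

With the wait in place, the line printed by sink should appear before "done", because runner does not return until the process substitution has exited.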

Best Answer

If I understand you correctly, you're wondering why $! is set to the PID of a process run inside >(...) only when that is part of the command line of a built-in command or function, but not when it's part of the command line of an external command.

Simplified example:

$ bash -c 'true > >(echo in=$BASHPID; sleep .1); echo psubst=$!'
psubst=12392
in=12392

$ bash -c '/bin/true > >(echo in=$BASHPID; sleep .1); echo psubst=$!'
in=12751
psubst=

That happens because, when an external command is used, bash forks a separate process to run it in. The process running inside the >(...) is then started as a child of that forked process, and so as a grandchild of your script, completely outside of its control.

By the time the external command terminates, its child (if still running) will be adopted by pid 1 (init), and so any link that could still be used to retrieve its PID from your script is broken.
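This also accounts for the endless loop in the question: with $! empty, [ -e /proc/$! ] tests /proc/ itself, which always exists on Linux. A minimal sketch of a guard against that, assuming a Linux /proc; the helper name wait_for_pid is hypothetical and not part of the original script.

```bash
#!/bin/bash
# Hypothetical helper: poll /proc (Linux-specific) until the given PID is gone.
wait_for_pid() {
    local pid=$1
    # An empty PID would turn the test into [ -e /proc/ ], which is always
    # true on Linux, so refuse to poll instead of looping forever.
    if [ -z "$pid" ]; then
        echo "wait_for_pid: no PID to wait for" >&2
        return 1
    fi
    while [ -e "/proc/$pid" ]; do sleep 0.1; done
}

sleep 0.3 &
wait_for_pid "$!" && echo "background job finished"
wait_for_pid "" || echo "refused to wait on an empty PID"
```

The guard only turns the hang into a visible failure; it does not recover the PID of the process substitution.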

A workaround may be to use a wrapper function which will cause all the process substitutions from its command line to be run as children of your script, so their PIDs could be retrieved via pgrep -P "$$".

Putting the external command in a {...} block and redirecting the output of the block also seems to work:

$ bash -c 'func(){ /bin/true; }; func > >(echo in=$BASHPID; sleep .1); echo psubst=$!'
in=3574
psubst=3574
$ bash -c '{ /bin/true; } > >(echo in=$BASHPID; sleep .1); echo psubst=$!'
in=3435
psubst=3435

Both workarounds rely on the way the current implementation works; e.g., bash may decide one day to optimize away trivial group commands or functions, breaking these assumptions.

Notice that $! being set to the PID of the last process substitution is an undocumented feature, which also does not work in shells other than bash.
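If depending on that undocumented behaviour is a concern, one alternative (a different technique from the one above, not something the question's script uses) is to replace the process substitution with a named pipe and an ordinary background job, for which $! is documented. A hedged sketch; the names sink and runner_fifo are hypothetical, and mktemp -d and mkfifo are assumed to be available:

```bash
#!/bin/bash
# Hypothetical stand-in for the logging sink.
sink() {
    local in
    in=$(cat)
    printf 'logged: %s\n' "$in"
}

runner_fifo() {
    local dir sink_pid status
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/pipe" || { rm -rf "$dir"; return 1; }

    # An ordinary background job: $! is documented for this case.
    sink < "$dir/pipe" &
    sink_pid=$!

    # Run the command with stdout sent into the FIFO, and keep its status.
    "$@" > "$dir/pipe"
    status=$?

    wait "$sink_pid"      # block until the sink has drained and printed
    rm -rf "$dir"
    return "$status"
}

runner_fifo bash -c 'echo hello'
echo "done"
```

Because the sink is a direct child started with &, it makes no difference whether the command passed to runner_fifo is a builtin, a function, or an external program.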
