Shell – What should interactive shells do in orphaned process groups

process, process-groups, shell, signals, tty

(Re-posting in unix per the suggestion in https://stackoverflow.com/questions/13718394/what-should-interactive-shells-do-in-orphaned-process-groups)

The short question is, what should a shell do if it is in an orphaned process group that doesn't own the tty? But I recommend reading the long question because it's amusing.

Here is a fun and exciting way to turn your laptop into a portable space heater, using your favorite shell (unless you're one of those tcsh weirdos):

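#include <unistd.h>

int main(void) {
    if (fork() == 0) {
        /* Child: become an interactive shell on the inherited terminal. */
        execl("/bin/bash", "/bin/bash", (char *)NULL);
        _exit(127); /* only reached if execl fails */
    }
    /* Parent: exit at once without waiting, which orphans the child. */
    return 0;
}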

This causes bash to peg the CPU at 100%. zsh and fish do the same, while ksh and tcsh mumble something about job control and then keel over, which is a bit better, but not much. Oh, and it's a platform-agnostic offender: OS X and Linux are both affected.

My (potentially wrong) explanation is as follows: the child shell detects it is not in the foreground: tcgetpgrp(0) != getpgrp(). Therefore it tries to stop itself: killpg(getpgrp(), SIGTTIN). But its process group is orphaned, because its parent (the C program) was the leader and died, and SIGTTIN sent to an orphaned process group is just dropped (otherwise nothing could start it again). Therefore, the child shell is not stopped, but it's still in the background, so it does it all again, right away. Rinse and repeat.

My question is, how can a command line shell detect this scenario, and what is the right thing for it to do? I have two solutions, neither of which is ideal:

  1. Try to signal the process whose pid matches our group ID. If that fails with ESRCH, it means we're probably orphaned.
  2. Try a non-blocking read of one byte from /dev/tty. If that fails with EIO, it means we're probably orphaned.
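Both ideas can be sketched roughly as follows. The function names are invented for illustration and neither is taken from any shell's source. Some caveats I'm fairly sure of: the first check only tells you the group leader is gone (a zombie leader still counts as existing), which is not the same as the POSIX definition of an orphaned group. The second consumes a byte of input if the read succeeds, and it also reports EIO when SIGTTIN is ignored or blocked, so it assumes SIGTTIN is at its default disposition.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Idea 1 (hypothetical helper): is there still a process whose pid
 * equals our process group ID? Returns 1 if it is gone, 0 otherwise. */
int group_leader_is_gone(void)
{
    /* Signal 0 only performs the existence and permission checks. */
    return kill(getpgrp(), 0) == -1 && errno == ESRCH;
}

/* Idea 2 (hypothetical helper): non-blocking one-byte read from /dev/tty.
 * Returns 1 on EIO, 0 on any other outcome, -1 if /dev/tty cannot be
 * opened (for example, no controlling terminal). */
int tty_read_gives_eio(void)
{
    int fd = open("/dev/tty", O_RDONLY | O_NONBLOCK);
    if (fd == -1)
        return -1;

    char byte;
    ssize_t n = read(fd, &byte, 1);   /* may eat one byte of real input */
    int saved_errno = errno;
    close(fd);

    return n == -1 && saved_errno == EIO;
}
```

A shell could call either helper once at startup, before entering its "wait until we are in the foreground" loop, and refuse to enable job control when the result is 1.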

(Our issue tracking this is https://github.com/fish-shell/fish-shell/issues/422 )

Thanks for your thoughts!

Best Answer

I agree with your analysis and I agree it sounds like you have to detect whether your process group is orphaned or not.

tcsetattr is also meant to return EIO if the process group is orphaned (and we're not blocking/ignoring SIGTTOU). That might be a less intrusive way than a read on the terminal.

Note that you can reproduce it with:

(bash<&1 &)

You need the redirection because otherwise stdin is redirected to /dev/null when a command is run in the background.

(bash<&1 & sleep 2)

Gives even weirder behaviour, because you end up with two shells reading from the terminal. Both are ignoring SIGTTIN, and the new one, once started, does not detect that it is no longer in the foreground process group.

ksh93's solution is not so bad: go through that loop at most 20 times (instead of indefinitely) before giving up.