Process Killed Before Launching in Background – Troubleshooting Tips

background-process job-control shell terminal-emulator xfce4-terminal

I'm using a bash script script.sh containing a command cmd, launched in background:

#!/bin/bash
…
cmd &
…

If I open a terminal emulator and run script.sh, cmd is properly executed in background, as expected. That is, while script.sh has ended, cmd continues to run in background, with PPID 1.

But if I open another terminal emulator (let's say xfce4-terminal) from the previous one (or at the beginning of the desktop session, which is my real use case), and execute script.sh with

xfce4-terminal -H -x script.sh

cmd is not properly executed anymore: It is killed by the termination of script.sh. Using nohup to prevent this is not sufficient. I am obliged to put a sleep command after it, otherwise cmd is killed by the termination of script.sh, before being dissociated from it.

The only way I found to make cmd properly execute in background is to put set -m in script.sh. Why is it necessary in this case, and not in the first one? Why this difference in behaviour between the two ways of executing script.sh (and hence cmd)?

I assume that, in the first case, monitor mode is not activated, as one can see by putting set -o in script.sh.
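A minimal sketch of that check, assuming only that the script is run non-interactively by a POSIX-style shell. The special parameter $- lists the active option letters, and m appears only when monitor mode is on. The variable name monitor_state is purely illustrative.

```shell
#!/bin/bash
# $- holds the shell's current option flags; "m" is present only when
# monitor mode (job control) is enabled.
case $- in
    *m*) monitor_state=on ;;
    *)   monitor_state=off ;;
esac
echo "monitor mode: $monitor_state"
```

Placed at the top of script.sh, this should report off in both launch scenarios unless set -m has been added before it.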

Best Answer

The process in which your cmd is supposed to run is killed by SIGHUP between the fork() and the exec(), so a nohup wrapper or anything similar never gets a chance to run and take effect. (You can check this with strace.)

Instead of nohup, you should set SIGHUP to SIG_IGN (ignore) in the parent shell before executing your background command; if a signal's disposition is "ignore" or "default", that disposition is inherited through fork() and exec(). Example:

#! /bin/sh
trap '' HUP    # ignore SIGHUP
xclock &
trap - HUP     # back to default

Or:

#! /bin/sh
(trap '' HUP; xclock &)

If you run this script with xfce4-terminal -H -x script.sh, the background command (xclock &) will not be killed by the SIGHUP sent when script.sh terminates.
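The inheritance of an ignored SIGHUP can be observed without any terminal involved. The following is a sketch under stated assumptions, not part of the original scripts: a child sh sends SIGHUP to itself and prints a line only if it survives. The names probe, default_result and ignored_result are hypothetical, and the script assumes it is not itself started with SIGHUP already ignored (for example under nohup), in which case the first probe would survive too.

```shell
#!/bin/sh
# Command run by a child shell: signal itself, then report if still alive.
probe='kill -HUP $$; echo "survived SIGHUP"'

# Default disposition: the child is expected to be killed by its own SIGHUP,
# so nothing is captured.
default_result=$(sh -c "$probe" 2>/dev/null)

# Ignored disposition: the trap is set inside the command substitution
# (a subshell), so the main script keeps its own SIGHUP handling.
# The child inherits SIG_IGN across fork() and exec().
ignored_result=$(trap '' HUP; sh -c "$probe")

echo "default disposition: ${default_result:-killed}"
echo "ignored disposition: ${ignored_result:-killed}"
```

The second probe corresponds to the (trap '' HUP; xclock &) form: the disposition is already in place before the child process exists, so there is no window in which the signal can kill it.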

When a session leader (a process that "owns" the controlling terminal, script.sh in your case) terminates, the kernel sends a SIGHUP to all processes in its foreground process group. But set -m enables job control, so commands started with & are put in a background process group and are not signaled with SIGHUP.

If job control is not enabled (the default for a non-interactive script), commands started with & will be run in the same foreground process group, and the "background" mode will be faked by redirecting their input from /dev/null and letting them ignore SIGINT and SIGQUIT.

Processes started this way from a script that was once run as a foreground job but has already exited won't be signaled with SIGHUP either, since their process group (inherited from their dead parent) is no longer the foreground one on the terminal.
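The effect of set -m on process groups can be sketched as follows. This assumes bash and a ps that accepts the POSIX -o pgid= and -p options; the helper pgid_of and the variable names are hypothetical, and sleep merely stands in for cmd.

```shell
#!/bin/bash
# Print the process group ID of the given PID, with padding removed.
pgid_of() { ps -o pgid= -p "$1" | tr -d ' '; }

script_pgid=$(pgid_of $$)

# Job control off (the default in a script): the background command
# stays in the script's own process group.
sleep 30 &
plain_pid=$!
plain_pgid=$(pgid_of "$plain_pid")

# Job control on: the background command is placed in a new process
# group of which it is the leader, so its PGID equals its PID.
set -m
sleep 30 &
job_pid=$!
job_pgid=$(pgid_of "$job_pid")

# Clean up the two placeholder jobs.
kill "$plain_pid" "$job_pid" 2>/dev/null

echo "script:            pgid=$script_pgid"
echo "job without set -m: pid=$plain_pid pgid=$plain_pgid"
echo "job with set -m:    pid=$job_pid pgid=$job_pgid"
```

When script.sh is the session leader and its group is the foreground one, only the first job shares the group that receives the SIGHUP.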

Extra notes:

The "hold mode" seems to differ between xterm and xfce4-terminal (and probably other vte-based terminals). While the former keeps the master side of the pty open, the latter tears it down after the program run with -e or -x has exited, causing any write to the slave side to fail with EIO. xterm will also ignore WM_DELETE_WINDOW messages (i.e. it won't close) while there are still processes from the foreground process group running.
