Linux – What exactly determines if a backgrounded job is killed when the shell is exited, or killed

This question has come up quite a lot (really a lot), but I'm finding the answers to be generally incomplete. The general question is "Why does/doesn't my job get killed when I exit/kill ssh?", and here's what I've found. The first question is: how general is the following information? It seems to be true for modern Debian Linux, but I am missing some bits. What else do others need to know?

1. All child processes of a shell opened over an ssh connection, backgrounded or not, are killed with SIGHUP when the ssh connection is closed, but only if the huponexit option is set. Run shopt huponexit to see whether it is.
2. If huponexit is true, you can use nohup or disown to dissociate the process from the shell so that it does not get killed when you exit. Or run things with screen.
3. If huponexit is false, which is the default on at least some Linux distributions these days, backgrounded jobs will not be killed on normal logout.
4. But even if huponexit is false, if the ssh connection gets killed or drops (as opposed to a normal logout), backgrounded processes will still get killed. This can be avoided with disown or nohup, as in (2).
5. There is some distinction between (a) processes whose parent process is the terminal and (b) processes that have stdin, stdout, or stderr connected to the terminal. I don't know what happens to processes that are (a) and not (b), or vice versa.
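
For anyone who wants to script the huponexit check from the list above, here is a minimal sketch. It assumes Bash is on the PATH; the helper name huponexit_is_set and its setup parameter are made up for illustration. Note that it starts a fresh non-interactive Bash, so it reports that shell's state, not the state of the login shell you are typing in (for that, run shopt huponexit directly at the prompt).

```python
import shutil
import subprocess


def huponexit_is_set(setup=""):
    """Return True/False for huponexit in a fresh Bash, or None if Bash is missing.

    `setup` is an optional snippet of Bash run first (hypothetical parameter,
    used here only to show `shopt -s` / `shopt -u`).
    """
    bash = shutil.which("bash")
    if bash is None:
        return None
    # `shopt -q NAME` prints nothing and reports the state via its exit
    # status: 0 means the option is set, non-zero means it is unset.
    script = f"{setup}\nshopt -q huponexit"
    return subprocess.run([bash, "-c", script]).returncode == 0


if __name__ == "__main__":
    print("fresh non-interactive Bash:", huponexit_is_set())
    print("after shopt -s huponexit:  ", huponexit_is_set("shopt -s huponexit"))
    print("after shopt -u huponexit:  ", huponexit_is_set("shopt -u huponexit"))
```

To make the setting stick for login sessions you would normally put shopt -s huponexit (or shopt -u huponexit) in a Bash startup file; which file applies depends on how your shell is started.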

Final question: How can I avoid behavior (3)? In other words, by default in Debian, backgrounded processes run along merrily by themselves after logout, but not after the ssh connection is killed. I'd like the same thing to happen to processes regardless of whether the connection was closed normally or killed. Or is this a bad idea?

Edit: Another important way to keep jobs from being killed, one that works (?) in either case, is to run them through screen. But the question is more about understanding when things get killed and when they don't: sometimes people want the jobs to be killed on logout, for instance.

More threads:
– Clarification on signals (sighup), jobs, and the controlling terminal
– https://serverfault.com/questions/117152/do-background-processes-get-a-sighup-when-logging-off
– Continue SSH background task/jobs when closing SSH
– Will a job put in background continue running after an SSH session is closed?
– Prevent an already running background process from being stopped after closing SSH client
– How can I start a process over SSH such that it will continue to run after I disconnect?
– Unable to keep remote job running on OS X
– Close SSH connection

Best Answer

A process isn't "killed with SIGHUP" -- at least, not in the strict sense of the word. Rather, when the connection is dropped, the terminal's controlling process (in this case, Bash) is sent a hang-up signal*, which is commonly abbreviated to the "HUP signal", or just SIGHUP.

Now, when a process receives a signal, it can handle it any way it wants**. The default for most signals (including HUP) is to exit immediately. However, the program is free to ignore the signal instead, or even to run some kind of signal handler function.

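Bash chooses the last option. According to the Bash manual, an interactive shell that receives SIGHUP re-sends it to all of its jobs, running or stopped, and only once it's finished with that does Bash exit. The "huponexit" option covers the other case: if it is set, Bash also sends SIGHUP to its jobs when an interactive login shell exits normally.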

Likewise, each child process is free to do whatever it wants when it receives the signal: leave it set to the default (i.e. die immediately), ignore it, or run a signal handler.
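
To make the three choices concrete, here is a small sketch in Python (chosen only because it is easy to run; it is not what Bash does internally). It starts three child processes that each pick a different SIGHUP disposition, sends each one a SIGHUP, and reports how it ended. The names CHILDREN, TEMPLATE and run_child, and the exit code 3, are made up for the illustration.

```python
import signal
import subprocess
import sys

# For each kind of child: the line that sets its SIGHUP disposition, and how
# long it idles afterwards (the "ignore" child idles briefly, then exits 0).
CHILDREN = {
    "default": ("signal.signal(signal.SIGHUP, signal.SIG_DFL)", 30),
    "ignore": ("signal.signal(signal.SIGHUP, signal.SIG_IGN)", 0.5),
    "handler": ("signal.signal(signal.SIGHUP, lambda signum, frame: sys.exit(3))", 30),
}

TEMPLATE = """
import signal, sys, time
{setup}
print("ready", flush=True)
time.sleep({idle})
"""


def run_child(kind):
    """Start one child, send it SIGHUP, and return its exit status."""
    setup, idle = CHILDREN[kind]
    code = TEMPLATE.format(setup=setup, idle=idle)
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, text=True)
    proc.stdout.readline()           # wait until the disposition is in place
    proc.send_signal(signal.SIGHUP)
    status = proc.wait(timeout=10)   # negative value = killed by that signal
    proc.stdout.close()
    return status


if __name__ == "__main__":
    for kind in CHILDREN:
        print(kind, "->", run_child(kind))
```

On Linux, the "default" child should come back with a negative status (killed by the signal), the "ignore" child should finish normally with 0, and the "handler" child should exit with whatever code its handler chose.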

Nohup only changes the default action for the child process to "ignore". Once the child process is running, however, it's free to change its own response to the signal.

This, I think, is why some programs die even though you ran them with nohup:

1. Nohup sets the default action to "ignore".
2. The program needs to do some kind of cleanup when it exits, so it installs a SIGHUP handler, incidentally overwriting the "ignore" flag.
3. When the SIGHUP arrives, the handler runs, cleaning up the program's data files (or whatever needed to be done) and exits the program.
4. The user doesn't know or care about the handler or cleanup, and just sees that the program exited despite nohup.
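
The sequence above can be reproduced with a short sketch. It does not call the real nohup; instead it assumes that starting a child with SIGHUP already set to "ignore" is a fair stand-in for what nohup does, and uses Popen's preexec_fn for that. The function name run_under_ignore and the exit code 7 are invented for the example.

```python
import signal
import subprocess
import sys

CHILD = """
import signal, sys, time
inherited = signal.getsignal(signal.SIGHUP)
if {install_handler}:
    # Hypothetical cleanup hook: installing it replaces the inherited "ignore".
    signal.signal(signal.SIGHUP, lambda signum, frame: sys.exit(7))
print(inherited == signal.SIG_IGN, flush=True)
time.sleep({idle})
"""


def run_under_ignore(install_handler):
    """Start a child with SIGHUP ignored, send SIGHUP, report what happened.

    Returns (child_started_with_sighup_ignored, exit_status).
    """
    idle = 30 if install_handler else 0.5
    code = CHILD.format(install_handler=install_handler, idle=idle)
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
        # Stand-in for nohup: ignore SIGHUP just before the program starts.
        preexec_fn=lambda: signal.signal(signal.SIGHUP, signal.SIG_IGN),
    )
    started_ignored = proc.stdout.readline().strip() == "True"
    proc.send_signal(signal.SIGHUP)
    status = proc.wait(timeout=10)
    proc.stdout.close()
    return started_ignored, status


if __name__ == "__main__":
    print("leaves SIGHUP alone:   ", run_under_ignore(False))
    print("installs its own hook: ", run_under_ignore(True))
```

Both children start out ignoring SIGHUP, but only the one that leaves the setting alone survives the signal; the one that installs its own handler exits through that handler, which is the "died despite nohup" case.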

This is where "disown" comes in. A process that's been disowned by Bash is never sent the HUP signal, regardless of the huponexit option. So even if the program sets up its own signal handler, the signal is never actually sent, so the handler never runs. Note, however, that if the program tries to display some text to a user that's logged out, it will cause an I/O error, which could cause the program to exit anyway.
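
Here is a toy model of the forwarding and of disown, again in Python. It is only a sketch of the idea: the JobTable class is a made-up stand-in for the shell's job list, not Bash's real data structure, and hang_up is called directly rather than from a real signal handler. The point is that a disowned process is still running and is still a child; the "shell" has simply forgotten it, so it is skipped when SIGHUP is passed along.

```python
import signal
import subprocess
import sys

# A child that uses the default SIGHUP action and then just sleeps.
SLEEPER = (
    "import signal, time\n"
    "signal.signal(signal.SIGHUP, signal.SIG_DFL)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


class JobTable:
    """Toy stand-in for a shell's job table (hypothetical, for illustration)."""

    def __init__(self):
        self.jobs = []

    def start(self):
        proc = subprocess.Popen([sys.executable, "-c", SLEEPER],
                                stdout=subprocess.PIPE, text=True)
        proc.stdout.readline()  # child is up, default SIGHUP action in place
        self.jobs.append(proc)
        return proc

    def disown(self, proc):
        # The process keeps running; we only drop it from the table.
        self.jobs.remove(proc)

    def hang_up(self):
        # Pass SIGHUP along to every job still in the table.
        for proc in self.jobs:
            proc.send_signal(signal.SIGHUP)


def demo():
    table = JobTable()
    kept = table.start()
    disowned = table.start()
    table.disown(disowned)
    table.hang_up()
    kept_status = kept.wait(timeout=10)
    still_running = disowned.poll() is None
    disowned.kill()  # tidy up the survivor so the demo does not leave it behind
    disowned.wait(timeout=10)
    for proc in (kept, disowned):
        proc.stdout.close()
    return kept_status, still_running


if __name__ == "__main__":
    print(demo())
```

The job that stayed in the table is killed by the forwarded SIGHUP, while the disowned one never receives it and keeps running until the demo cleans it up.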

* And, yes, before you ask, the "hang-up" terminology is left over from UNIX's dialup mainframe days.

** Most signals, anyway. SIGKILL, for instance, always causes the program to terminate immediately, period.
