My server program received a SIGTERM and stopped (with exit code 0). I am surprised by this, as I am pretty sure that there was plenty of memory for it. Under what conditions does Linux (BusyBox) send a SIGTERM to a process?
Linux – When Does the System Send a SIGTERM to a Process?
busybox linux process signals
Related Solutions
Processes can call the _exit() system call (on Linux, see also exit_group()) with an integer argument to report an exit code to their parent. Though it's an integer, only the 8 least significant bits are available to the parent (the exception is when using waitid() or a handler on SIGCHLD in the parent to retrieve that code, though not on Linux).
The parent will typically do a wait() or waitpid() to get the status of their child as an integer (though waitid() with somewhat different semantics can be used as well).
On Linux and most Unices, if the process terminated normally, bits 8 to 15 of that status number will contain the exit code as passed to exit(). If not, then the 7 least significant bits (0 to 6) will contain the signal number and bit 7 will be set if a core was dumped.
perl's $? for instance contains that number as set by waitpid():
$ perl -e 'system q(kill $$); printf "%04x\n", $?'
000f # killed by signal 15
$ perl -e 'system q(kill -ILL $$); printf "%04x\n", $?'
0084 # killed by signal 4 and core dumped
$ perl -e 'system q(exit $((0xabc))); printf "%04x\n", $?'
bc00 # terminated normally, 0xbc the lowest 8 bits of the status
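The three values printed above can be decoded by hand with shell arithmetic. The following is only a sketch of the bit layout described earlier; decode_wait_status is a hypothetical helper name, and it deliberately ignores the stopped and continued states that a real waitpid() status can also encode.

```shell
# decode_wait_status STATUS (hypothetical helper, not a standard command).
# STATUS is the raw 16-bit wait status as an integer, decimal or 0x-prefixed.
decode_wait_status() {
    status=$(( $1 ))
    sig=$(( status & 0x7f ))            # bits 0 to 6: signal number, 0 if none
    if [ "$sig" -eq 0 ]; then
        # bits 8 to 15: the exit code passed to exit()
        echo "exited with code $(( (status >> 8) & 0xff ))"
    elif [ $(( status & 0x80 )) -ne 0 ]; then
        # bit 7: set when a core was dumped
        echo "killed by signal $sig (core dumped)"
    else
        echo "killed by signal $sig"
    fi
}

decode_wait_status 0x000f   # killed by signal 15
decode_wait_status 0x0084   # killed by signal 4 (core dumped)
decode_wait_status 0xbc00   # exited with code 188
```

In C, the WIFEXITED(), WEXITSTATUS(), WIFSIGNALED() and WTERMSIG() macros do this decoding and should be preferred over manual bit twiddling.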
Bourne-like shells also make the exit status of the last run command available in their own $? variable. However, it does not directly contain the number returned by waitpid(), but a transformation of it, and that transformation differs between shells.
What's common between all shells is that $? contains the lowest 8 bits of the exit code (the number passed to exit()) if the process terminated normally.
Where it differs is when the process is terminated by a signal. In all cases, and that's required by POSIX, the number will be greater than 128. POSIX doesn't specify what the value may be. In practice though, in all Bourne-like shells that I know, the lowest 7 bits of $? will contain the signal number. But, where n is the signal number,
in ash, zsh, pdksh, bash, the Bourne shell, $? is 128 + n. What that means is that in those shells, if you get a $? of 129, you don't know whether it's because the process exited with exit(129) or whether it was killed by the signal 1 (HUP on most systems). But the rationale is that shells, when they do exit themselves, by default return the exit status of the last exited command. Making sure $? is never greater than 255 allows for a consistent exit status:
$ bash -c 'sh -c "kill \$\$"; printf "%x\n" "$?"'
bash: line 1: 16720 Terminated sh -c "kill \$\$"
8f # 128 + 15
$ bash -c 'sh -c "kill \$\$"; exit'; printf '%x\n' "$?"
bash: line 1: 16726 Terminated sh -c "kill \$\$"
8f # here that 0x8f is from a exit(143) done by bash. Though it's
   # not from a killed process, that does tell us that probably
   # something was killed by a SIGTERM
in ksh93, $? is 256 + n. That means that from a value of $? you can differentiate between a killed and non-killed process. Newer versions of ksh, upon exit, if $? was greater than 255, kill themselves with the same signal in order to be able to report the same exit status to their parent. While that sounds like a good idea, that means that ksh will generate an extra core dump (potentially overwriting the other one) if the process was killed by a core generating signal:
$ ksh -c 'sh -c "kill \$\$"; printf "%x\n" "$?"'
ksh: 16828: Terminated
10f # 256 + 15
$ ksh -c 'sh -c "kill -ILL \$\$"; exit'; printf '%x\n' "$?"
ksh: 16816: Illegal instruction(coredump)
Illegal instruction(coredump)
104 # 256 + 4, ksh did indeed kill itself so as to report the same
    # exit status as sh. Older versions of `ksh93` would have returned
    # 4 instead.
Where you could even say there's a bug is that ksh93 kills itself even if $? comes from a return 257 done by a function:
$ ksh -c 'f() { return "$1"; }; f 257; exit'
zsh: hangup ksh -c 'f() { return "$1"; }; f 257; exit'
# ksh kills itself with a SIGHUP so as to report a 257 exit status
# to its parent
yash offers a compromise. It returns 256 + 128 + n. That means we can also differentiate between a killed process and one that terminated properly. And upon exiting, it will report 128 + n without having to kill itself, avoiding the side effects that can have.
$ yash -c 'sh -c "kill \$\$"; printf "%x\n" "$?"'
18f # 256 + 128 + 15
$ yash -c 'sh -c "kill \$\$"; exit'; printf '%x\n' "$?"
8f # that's from a exit(143), yash was not killed
To get the signal from the value of $?, the portable way is to use kill -l:
$ /bin/kill 0
Terminated
$ kill -l "$?"
TERM
(for portability, you should never use signal numbers, only signal names)
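Putting the 128 + n convention and kill -l together, a script can make a rough guess about how a command ended. This is a minimal sketch, assuming a shell that uses 128 + n (ash, bash, zsh, pdksh); describe_status is a hypothetical helper name, and the ambiguity with a plain exit(128 + n) discussed above remains.

```shell
# describe_status N (hypothetical helper): interpret a $? value under the
# 128 + n convention. It cannot tell a signal death from exit(128 + n).
describe_status() {
    if [ "$1" -gt 128 ] && name=$(kill -l "$1" 2>/dev/null) && [ -n "$name" ]; then
        echo "status $1: probably killed by SIG$name (or the command called exit($1))"
    else
        echo "status $1: exited normally"
    fi
}

# Capture the status with "|| status=$?" so the snippet also works under set -e.
status=0
sh -c 'kill $$' || status=$?     # the child shell sends itself SIGTERM
describe_status "$status"

status=0
sh -c 'exit 3' || status=$?
describe_status "$status"
```

Under ksh93 or yash the first call would see 256 + n or 384 + n instead, so the threshold test would need adjusting there.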
On the non-Bourne fronts:
csh/tcsh and fish: same as the Bourne shell except that the status is in $status instead of $? (note that zsh also sets $status for compatibility with csh (in addition to $?)).
rc: the exit status is in $status as well, but when killed by a signal, that variable contains the name of the signal (like sigterm or sigill+core if a core was generated) instead of a number, which is yet another proof of the good design of that shell.
es: the exit status is not a variable. If you care for it, you run the command as:
status = <={cmd}
which will return a number or sigterm or sigsegv+core like in rc.
Maybe for completeness, we should mention zsh's $pipestatus and bash's $PIPESTATUS arrays that contain the exit status of the components of the last pipeline.
And also for completeness, when it comes to shell functions and sourced files, by default functions return with the exit status of the last command run, but can also set a return status explicitly with the return builtin. And we see some differences here:
bash and mksh (since R41, a regression^Wchange apparently introduced intentionally) will truncate the number (positive or negative) to 8 bits. So for instance return 1234 will set $? to 210, return -- -1 will set $? to 255.
zsh and pdksh (and derivatives other than mksh) allow any signed 32 bit decimal integer (-2^31 to 2^31-1) (and truncate the number to 32 bits).
ash and yash allow any non-negative integer from 0 to 2^31-1 and return an error for any number outside that range.
ksh93 for return 0 to return 320 sets $? as is, but for anything else, truncates to 8 bits. Beware, as already mentioned, that returning a number between 256 and 320 could cause ksh to kill itself upon exit.
rc and es allow returning anything, even lists.
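The 8-bit truncation described for bash and mksh amounts to keeping the low-order byte of the number. The sketch below reproduces that arithmetic only; low8 is a hypothetical helper name, and it does not call return itself, so the result does not depend on which shell runs it.

```shell
# low8 N (hypothetical helper): print N truncated to its 8 low-order bits.
low8() {
    echo $(( $1 & 255 ))
}

low8 1234    # 210, matching the "return 1234" case above
low8 -1      # 255, matching the "return -- -1" case above
low8 257     # 1
```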
Also note that some shells also use special values of $?/$status to report some error conditions that are not the exit status of a process, like 127 or 126 for command not found or not executable (or syntax error in a sourced file)...
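Both special values are easy to provoke. The following is a small sketch, assuming sh is a POSIX-style shell and mktemp is available; the command name and the variable names are made up for the illustration.

```shell
# 127: the shell could not find the command at all.
not_found_status=0
sh -c 'no_such_command_for_this_demo' 2>/dev/null || not_found_status=$?
echo "command not found: $not_found_status"

# 126: the file exists but cannot be executed (mktemp creates it without
# any execute permission bit).
tmp=$(mktemp)
echo 'echo hi' > "$tmp"
not_exec_status=0
sh -c '"$1"' sh "$tmp" 2>/dev/null || not_exec_status=$?
echo "not executable: $not_exec_status"
rm -f "$tmp"
```

Neither status comes from a process that ran and exited: the inner shell reports them because it could not start the command.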
To answer that question, you have to understand how signals are sent to a process and how a process exists in the kernel.
Each process is represented as a task_struct inside the kernel (the definition is in the sched.h header file). That struct holds information about the process, for instance the pid. The important information is in line 1566 where the associated signal is stored. This is set only if a signal is sent to the process.
A dead process or a zombie process still has a task_struct. The struct remains until the parent process (natural or by adoption) has called wait() after receiving SIGCHLD to reap its child process. When a signal is sent, the signal_struct is set. It doesn't matter if the signal is a catchable one or not, in this case.
Signals are evaluated every time the process runs. Or, to be exact, before the process would run. The process is then in the TASK_RUNNING state. The kernel runs the schedule() routine, which determines the next running process according to its scheduling algorithm. Assuming this process is the next running process, the value of the signal_struct is evaluated to see whether there is a waiting signal to be handled. If a signal handler is manually defined (via signal() or sigaction()), the registered function is executed; if not, the signal's default action is executed. The default action depends on the signal being sent.
For instance, the SIGSTOP signal's default handler will change the current process's state to TASK_STOPPED and then run schedule() to select a new process to run. Note that SIGSTOP is not catchable (like SIGKILL), so there is no possibility to register a manual signal handler. In case of an uncatchable signal, the default action will always be executed.
To your question:
A defunct or dead process will never be determined by the scheduler to be in the TASK_RUNNING state again. Thus the kernel will never run the signal handler (default or defined) for the corresponding signal, whichever signal it was. Therefore the exit_signal will never be set again. The signal is "delivered" to the process by setting the signal_struct in the task_struct of the process, but nothing else will happen, because the process will never run again. There is no code to run; all that remains of the process is that process struct.
However, if the parent process reaps its children with wait(), the exit code it receives is the one from when the process "initially" died. It doesn't matter if there is a signal waiting to be handled.
Best Answer
I'll post this as an answer so that there's some kind of resolution if this turns out to be the issue.
An exit status of 0 means a normal exit from a successful program. An exiting program can choose any integer between 0 and 255 as its exit status. Conventionally, programs use small values. Values 126 and above are used by the shell to report special conditions, so it's best to avoid them.
At the C API level, programs report a 16-bit status¹ that encodes both the program's exit status and the signal that killed it, if any.
In the shell, a command's exit status (saved in $?) conflates the actual exit status of the program and the signal value: if a program is killed by a signal, $? is set to a value greater than 128 (with most shells, this value is 128 plus the signal number; AT&T ksh uses 256 + signal number and yash uses 384 + signal number, which avoids the ambiguity, but the other shells haven't followed suit).
In particular, if $? is 0, your program exited normally.
Note that this includes the case of a process that receives SIGTERM, but has a signal handler for it, and eventually exits normally (perhaps as an indirect consequence of the SIGTERM signal, perhaps not).
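That last case matches the symptom in the question: a SIGTERM followed by exit code 0. The sketch below contrasts a child shell without a handler against one that traps SIGTERM. It assumes sh uses the 128 + n convention, and the variable names are made up for the illustration.

```shell
# No handler: the default action of SIGTERM kills the child shell, so the
# parent sees 128 + 15 and "exit 0" is never reached.
plain_status=0
sh -c 'kill -TERM $$; exit 0' || plain_status=$?
echo "no handler:   $plain_status"

# With a handler: the trap runs once kill returns and the child exits
# normally with status 0, so "exit 1" is never reached.
handled_status=0
sh -c 'trap "exit 0" TERM; kill -TERM $$; exit 1' || handled_status=$?
echo "with handler: $handled_status"
```

A server that installs a SIGTERM handler to shut down cleanly behaves like the second case, so an exit status of 0 does not rule out that a SIGTERM was received.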
To answer the question in your title, SIGTERM is never sent automatically by the system. There are a few signals that are sent automatically like SIGHUP when a terminal goes away, SIGSEGV/SIGBUS/SIGILL when a process does things it shouldn't be doing, SIGPIPE when it writes to a broken pipe/socket, etc. And there are a few signals that are sent due to a key press in a terminal, mainly SIGINT for Ctrl+C, SIGQUIT for Ctrl+\ and SIGTSTP for Ctrl+Z, but SIGTERM is not one of those. If a process receives SIGTERM, some other process sent that signal.
¹ roughly speaking