Does a process invoking oom-killer kill itself

kill process-management syslog

Looking through syslog, I see lines like dd invoked oom-killer.
Does this mean dd is being killed by the oom-killer or does it mean dd asked oom-killer to go kill another high memory process?

Best Answer

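dd triggered the OOM killer, which in turn killed the process with the highest OOM score. The "invoked oom-killer" line names the process whose memory request set things off, not the victim, so the process that was killed may or may not be dd itself.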

Related Solutions

Process Management – How Does Linux ‘Kill’ a Process?

Sending kill -9 to a process doesn't require the process' cooperation (like handling a signal), it just kills it off.

You're presuming that because some signals can be caught and ignored they all involve cooperation. But as per man 2 signal, "the signals SIGKILL and SIGSTOP cannot be caught or ignored". SIGTERM can be caught, which is why plain kill is not always effective – generally this means something in the process's handler has gone awry.¹

If a process doesn't (or can't) define a handler for a given signal, the kernel performs a default action. In the case of SIGTERM and SIGKILL, this is to terminate the process (unless its PID is 1; the kernel will not terminate init)² meaning its file handles are closed, its memory returned to the system pool, its parent receives SIGCHLD, its orphan children are inherited by init, etc., just as if it had called exit (see man 2 exit). The process no longer exists – unless it ends up as a zombie, in which case it is still listed in the kernel's process table with some information; that happens when its parent does not wait and deal with this information properly. However, zombie processes no longer have any memory allocated to them and hence cannot continue to execute.

Is there something like a global table in memory where Linux keeps references to all resources taken up by a process and when I "kill" a process Linux simply goes through that table and frees the resources one by one?

I think that's accurate enough. Physical memory is tracked by page (one page usually equalling a 4 KB chunk) and those pages are taken from and returned to a global pool. It's a little more complicated in that some freed pages are cached in case the data they contain is required again (that is, data which was read from a still existing file).

Manpages talk about "signals" but surely that's just an abstraction.

Sure, all signals are an abstraction. They're conceptual, just like "processes". I'm playing semantics a bit, but if you mean SIGKILL is qualitatively different than SIGTERM, then yes and no. Yes in the sense that it can't be caught, but no in the sense that they are both signals. By analogy, an apple is not an orange but apples and oranges are, according to a preconceived definition, both fruit. SIGKILL seems more abstract since you can't catch it, but it is still a signal. Here's an example of SIGTERM handling, I'm sure you've seen these before:

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <string.h>

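/* SA_SIGINFO-style handler: info carries the signal number plus the sender's PID and UID.
 * Note: fprintf() is not async-signal-safe; it is used here only to keep the demo short. */
void sighandler (int signum, siginfo_t *info, void *context) {
    (void) signum;   /* same value as info->si_signo */
    (void) context;  /* unused */
    fprintf (
        stderr,
        "Received %d from pid %u, uid %u.\n",
        info->si_signo,
        (unsigned) info->si_pid,
        (unsigned) info->si_uid
    );
}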

int main (void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sighandler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGTERM, &sa, NULL);
    while (1) sleep(10);
    return 0;
}

This process will just sleep forever. You can run it in a terminal and send it SIGTERM with kill. It spits out stuff like:

Received 15 from pid 25331, uid 1066.

1066 is my UID. The PID will be that of the shell from which kill is executed, or the PID of kill itself if you fork it (kill 25309 & echo $!).

Again, there's no point in setting a handler for SIGKILL because it won't work.³ If I kill -9 25309 the process will terminate. But that's still a signal; the kernel has the information about who sent the signal, what kind of signal it is, etc.

¹ If you haven't looked at the list of possible signals, see kill -l.

² Another exception, as Tim Post mentions below, applies to processes in uninterruptible sleep. These can't be woken up until the underlying issue is resolved, and so have ALL signals (including SIGKILL) deferred for the duration. A process can't create that situation on purpose, however.

³ This doesn't mean using kill -9 is a better thing to do in practice. My example handler is a bad one in the sense that it doesn't lead to exit(). The real purpose of a SIGTERM handler is to give the process a chance to do things like clean up temporary files, then exit voluntarily. If you use kill -9, it doesn't get this chance, so only do that if the "exit voluntarily" part seems to have failed.

Process – How to Kill a Process That Keeps Restarting

If it starts again automatically with a different process ID, it is a different process. So there is a parent process that monitors its children and respawns any that die. If you want to stop the service completely, find out how to stop the parent process. Killing it with SIGKILL is of course one of the options, but probably not The Right One™, since the service monitor might need to do some cleanup to shut down properly.

To find the monitor process, you may need to inspect the whole process list, since the actual listeners might dissociate themselves from their parent (usually via the fork() + setsid() combo). In that case I find the output of ps faux (from procps at least; other implementations may vary) rather handy: it lists all processes as a hierarchical tree. Unless there has been a PID wraparound (see also Wikipedia), the monitor's PID should be smaller than the PID of any of the listeners.
