Process Parent-Child – Significance of a Process’s Parent from Child’s Perspective

Tags: posix, process, process-groups

In POSIX, processes are “related” to each other through two basic hierarchies:

  1. The hierarchy of parent and child processes.

  2. The hierarchy of sessions and process groups.

User processes have a great deal of control over the latter, via setpgid and setsid, but they have very little control over the former—the parent process ID is set when a process is spawned and altered by the kernel when the parent exits (usually to PID 1), but otherwise it does not change. Reflecting on that, I’ve been wondering how important the parent–child relationship really is.

Here’s a summary of my understanding so far:

  • Parent–child relationships are clearly important from the perspective of the parent process, since various syscalls, like wait and setpgid, are only allowed on child processes.

  • The session–group–process relationship is clearly important to all processes, both the session leader and other processes in the session, since syscalls like kill operate on entire process groups, setpgid can only be used to join a group in the same session, and all processes in a session’s foreground process group are sent SIGHUP if the session leader exits.

  • What’s more, the two hierarchies are clearly related from the perspective of the parent, since setsid only affects new children and setpgid can only be used on children, but they seem essentially unrelated from the perspective of the child (since a parent process dying has no impact whatsoever on a process’s group or session).

Conspicuously absent, however, is any reason for a child process to care what its current parent is. Therefore, I have the following question: does the current value of getppid() have any importance whatsoever from the perspective of the child process, besides perhaps identifying whether or not its spawning process has exited?


To put the same question another way, imagine the same program is spawned twice, from the same parent, in two different ways:

  1. The first child is spawned in the usual way, by fork() followed shortly by exec().

  2. The second child is spawned indirectly: the parent process calls fork(), and then the child also calls fork(), and it’s the grandchild process that calls exec(). The immediate child then exits, so the grandchild is orphaned, and its PPID is reassigned to PID 1.

In this hypothetical scenario, assuming all else is equal, do any reasonable programs have any reason to behave any differently? So far, my conclusion seems to be “no,” since the session is left unchanged, as are the process’s inherited file descriptors… but I’m not sure.
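A minimal sketch of that experiment, written in Python only because its os module maps directly onto the syscalls involved. The helper names report and run_child are made up for illustration. Each final process reports its PPID, process group, and session back through a pipe, and the grandchild waits until it has actually been reparented before reporting.

```python
import json
import os
import time


def report(write_fd):
    """Write this process's PID, PPID, PGID and SID to write_fd as one JSON line."""
    info = {"pid": os.getpid(), "ppid": os.getppid(),
            "pgid": os.getpgrp(), "sid": os.getsid(0)}
    os.write(write_fd, (json.dumps(info) + "\n").encode())


def run_child(double_fork):
    """Fork a reporting process, directly or through a short-lived intermediate.

    Returns the dict reported by the final process (child or grandchild).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            if double_fork:
                intermediate = os.getpid()   # recorded before the second fork
                if os.fork() != 0:
                    os._exit(0)              # intermediate exits, orphaning the grandchild
                # Grandchild: poll until the kernel has reparented us (give up after 5 s).
                deadline = time.monotonic() + 5
                while os.getppid() == intermediate and time.monotonic() < deadline:
                    time.sleep(0.01)
            report(write_fd)
        finally:
            os._exit(0)                      # never return into the caller's code
    os.close(write_fd)
    os.waitpid(pid, 0)                       # reap the direct child or the intermediate
    with os.fdopen(read_fd) as reader:
        return json.loads(reader.readline())


if __name__ == "__main__":
    print("direct child:", run_child(double_fork=False))
    print("orphaned    :", run_child(double_fork=True))
```

On a typical system the two reports should differ only in pid and ppid; the process group and session come out identical, which is the "all else is equal" part of the scenario.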

Note: I do not consider “acquiring the parent PID to communicate with it” to be a valid answer to that question, since orphaned programs cannot in general rely on their PPID to be set to 1 (some systems set orphaned processes’ PPID to some other value), so the only way to avoid a race condition is to acquire the parent process ID via a call to getpid() before forking, then to use that value in the child.
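The race-free approach described in the note can be sketched like this (again in Python; parent_still_alive and spawn_and_check are hypothetical names). The parent records its own PID before forking, and the child compares getppid() against that saved value rather than against a hard-coded 1.

```python
import os


def parent_still_alive(expected_parent):
    """True if our current parent is still the process that spawned us."""
    return os.getppid() == expected_parent


def spawn_and_check():
    """Fork a child that checks its parent; return what the child observed."""
    parent_pid = os.getpid()       # captured before fork, inherited by the child
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            alive = parent_still_alive(parent_pid)
            os.write(write_fd, b"1" if alive else b"0")
        finally:
            os._exit(0)
    os.close(write_fd)
    os.waitpid(pid, 0)
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read() == b"1"


if __name__ == "__main__":
    print("child saw its original parent:", spawn_and_check())
```

Comparing against the saved PID works no matter which process the system reparents orphans to.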

Best Answer

When I saw this question, I was pretty interested because I know I've seen getppid used before, but I couldn't remember where. So I turned to one of the projects that I figured has probably used every Linux syscall and then some: systemd. One GitHub search later, I found two uses that illustrate some more general use cases (there are a few other uses as well, but they're more specific to systemd):

  • In sd-notify. For some context: systemd needs to know when a service has started so it can proceed to start any that depend on it. This is normally done from a C program via the sd_notify API, which is a way for daemons to tell systemd their status.

    Of course, if you're using a shell script as a service...calling C functions isn't exactly doable. Therefore, systemd comes with the systemd-notify command, which is a small wrapper over the sd_notify API. One problem: systemd also needs to know the PID that is sending the message. For systemd-notify, this would be its own PID, which would be a short-lived process ID that immediately goes away. Not useful.

    You probably already know where I'm headed: getppid is used by systemd-notify to grab the parent process's PID, since that's usually the actual service process. In short, getppid can be used by a short-lived CLI application to send a message on behalf of the parent process.

    Once I found this, another Unix tool that might use getppid like this came to mind: polkit, which is an authorization framework used to gate stuff like sending D-Bus messages or running privileged applications. (At minimum, I'd guess you've seen the GUI password prompts that are displayed by polkit's auth agents.) polkit includes an executable named pkexec that can be used a bit like sudo, except that polkit handles the authorization. Now, polkit needs to know the PID of the process asking for authorization... yeah, you get the idea: pkexec uses getppid to find that.

    (While looking at that, I also found out that polkit's TTY auth agent uses it too.)

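  • This one's a bit less interesting but still notable: getppid is used to emulate PR_SET_PDEATHSIG when the parent has already died by the time that option is set. (PR_SET_PDEATHSIG is a Linux prctl option that lets a child ask to be sent a signal, such as SIGKILL, when its parent dies. If the parent is already gone when prctl is called, no signal ever arrives, so the child compares getppid against the expected parent PID and acts on a mismatch itself.)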
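Here is a rough, Linux-only sketch of that last pattern. It is not systemd's actual code. It calls prctl through ctypes, and the names die_with_parent and demo are made up for illustration. The constant 1 for PR_SET_PDEATHSIG comes from <linux/prctl.h>.

```python
import ctypes
import os
import signal

PR_SET_PDEATHSIG = 1  # from <linux/prctl.h>; Linux-specific


def die_with_parent(expected_parent, sig=signal.SIGTERM):
    """Ask the kernel to send `sig` when the parent dies, then close the race.

    expected_parent is the parent's PID as recorded by the parent before fork.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(ctypes.c_int(PR_SET_PDEATHSIG), ctypes.c_ulong(int(sig)),
                    ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    # If the parent exited before prctl took effect, the kernel will never
    # deliver the signal, so emulate it ourselves.
    if os.getppid() != expected_parent:
        os.kill(os.getpid(), sig)


def demo(expected_parent):
    """Fork a child that calls die_with_parent; return the child's wait status."""
    pid = os.fork()
    if pid == 0:
        try:
            signal.signal(signal.SIGTERM, signal.SIG_DFL)
            die_with_parent(expected_parent)
            os._exit(0)            # parent matched: exit normally
        finally:
            os._exit(1)            # only reached if something raised
    _, status = os.waitpid(pid, 0)
    return status


if __name__ == "__main__":
    ok = demo(os.getpid())         # real parent PID: child exits normally
    gone = demo(-1)                # impossible PID: simulates a dead parent
    print("normal exit:", os.WIFEXITED(ok), "| signalled:", os.WIFSIGNALED(gone))
```

Passing -1 as the expected parent is only a way to simulate "the parent was already gone" without actually racing a fork against an exit.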