Linux – What’s a reliable technique for killing background processes on script termination?

bash, linux, process, shell

I use shell scripts to react to system events and update status displays in my window manager. For example, one script determines the current wifi status by listening to multiple sources:

  1. associate/dissociate events from wpa_supplicant
  2. address changes from ip (so I know when dhcpcd has assigned an address)
  3. a timer process (so the signal strength updates from time to time)

To achieve the multiplexing, I end up spawning background processes:

{
    wpa_cli -p /var/run/wpa_supplicant -i wlan0 -a echo &   # associate/dissociate events
    ip monitor address &                                    # address changes
    while sleep 30; do echo; done                           # periodic timer tick
} |
while read -r line; do update_wifi_status; done &

That is, whenever any of the event sources outputs a line, my wifi status updates. The entire pipeline runs in the background (the final '&') because I also watch another event source that causes my script to terminate:

wait_for_termination
kill $!

The kill is supposed to clean up the background processes, but in this form it doesn't quite do the job. At the very least, the 'wpa_cli' and 'ip' processes always survive, and they don't die on their next event either (in theory they should get a SIGPIPE; I guess the reading process must still be alive too).

The question is, how to reliably [and elegantly!] clean up all the background processes spawned?

Best Answer

The super simple solution is to add this at the end of the script:

kill -- -$$

Explanation:

$$ gives us the PID of the running shell. So, kill $$ would send a SIGTERM to the shell process. However, if we negate the PID, kill sends a SIGTERM to every process in the process group. We need the -- beforehand so kill knows that -$$ is a process group ID and not a flag.
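To see the group-wide kill in action without risking the calling shell, here is a minimal sketch. It makes a few assumptions that are not in the original script: the two sleep commands stand in for wpa_cli and ip, the temporary pidfile exists only to check the result, setsid is used so the throwaway script runs in its own process group, and the inner trap is my addition so that the shell survives its own SIGTERM long enough to reap its children.

```shell
#!/bin/sh
# Sketch only: the sleeps are stand-ins for the real event sources.
pidfile=$(mktemp)

# Run the demo in its own session/process group so the group-wide
# kill cannot reach the shell that launched it.
setsid sh -c '
    sleep 300 &                 # stand-in for wpa_cli
    echo "$!" > "$1"
    sleep 300 &                 # stand-in for ip monitor
    echo "$!" >> "$1"
    # Ignore TERM only now, after the children were started, so they
    # keep the default disposition and still die from the signal.
    trap "" TERM
    kill -- -$$                 # SIGTERM to every process in the group
    wait                        # reap the terminated children
' demo "$pidfile"

# Count background processes that are still alive.
survivors=0
while read -r pid; do
    if kill -0 "$pid" 2>/dev/null; then
        survivors=$((survivors + 1))
    fi
done < "$pidfile"
rm -f "$pidfile"
echo "survivors: $survivors"
```

In a real script the trap is optional: if the script is finished anyway, letting the SIGTERM take down the shell along with its children is usually fine.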

Note that this relies on the running shell being a process group leader! Otherwise, $$ (the PID) will not match the process group ID, and you end up sending a signal to who knows where (well, probably nowhere as there is unlikely to be a process group with a matching ID if we're not a group leader).
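One way to guard against that is to check whether the shell really is a group leader before sending the signal. The following is a rough sketch, not something from the original script: is_group_leader is a hypothetical helper name, and it assumes a ps that supports the POSIX -o pgid= output format.

```shell
#!/bin/sh
# is_group_leader is a hypothetical helper, not part of the original script.
is_group_leader() {
    # Ask ps for this shell's process group ID and strip the padding.
    pgid=$(ps -o pgid= -p "$$" | tr -d ' ')
    # A group leader's PID equals its process group ID.
    [ "$pgid" = "$$" ]
}

if is_group_leader; then
    echo "leader: kill -- -$$ would target this script's own group"
else
    echo "not a leader: kill -- -$$ would not target this script's group"
fi
```

A cleanup routine could call the helper first and fall back to killing individual PIDs when the check fails.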

When an interactive shell with job control launches a command, it places that command in a new process group[1]. Every process forked from there stays in that process group, unless it explicitly changes its process group via a syscall (setpgid).

The easiest way to guarantee a particular script runs as a process group leader is to launch it using setsid. For example, I have a few of these status scripts which I launch from a parent script:

#!/bin/sh
wifi_status &
bat_status &

Written like this, both the wifi and battery scripts run in the same process group as the parent script, so neither is a group leader and kill -- -$$ doesn't work inside them. The fix is:

#!/bin/sh
setsid wifi_status &
setsid bat_status &
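The difference is easy to check from a shell. This small sketch (my own illustration, assuming util-linux setsid and a ps that supports -o pgid=) prints the PID and process group ID of a child shell started plainly and of one started through setsid:

```shell
#!/bin/sh
# report prints "PID PGID" for the shell that runs it.
report='echo "$$ $(ps -o pgid= -p $$ | tr -d " ")"'

plain=$(sh -c "$report")          # inherits the caller's process group
leader=$(setsid sh -c "$report")  # becomes leader of a new group

echo "plain:  $plain"
echo "setsid: $leader"
```

For the setsid line the two numbers should match, which is exactly the condition kill -- -$$ depends on.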

I found pstree -p -g useful to visualise process & process group IDs.

Thanks to everyone who contributed and made me dig a little deeper, I learnt stuff! :)

[1] Are there other circumstances in which the shell creates a process group, e.g. on starting a subshell? I don't know...
