Great classic question about managing jobs and signals, with good examples! I've developed a stripped-down test script to focus on the mechanics of the signal handling.
To accomplish this, after starting the children (loop.sh) in the background, call wait, and upon receipt of the INT signal, kill the process group whose PGID equals your PID.
For the script in question, play.sh, this can be accomplished by the following:
In the stop() function, replace exit 1 with
kill -TERM -$$ # note the dash, negative PID, kills the process group
Start loop.sh as a background process (multiple background processes can be started here and managed by play.sh):
loop.sh &
Add wait at the end of the script to wait for all children:
wait
When your script starts a process, that child becomes a member of the script's process group. If the script was launched from an interactive shell with job control, the script leads that group, so the PGID equals the script's PID, which is $$ inside the script.
For example, the script trap.sh started three sleep processes in the background and is now waiting on them. Notice that the process group ID column (PGID) is the same as the PID of the parent process:
PID PGID STAT COMMAND
17121 17121 T sh trap.sh
17122 17121 T sleep 600
17123 17121 T sleep 600
17124 17121 T sleep 600
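The same relationship can be checked from inside a script. Here is a minimal sketch, assuming a ps that supports the POSIX -o and -p options; the variable names are illustrative and not taken from the original scripts:

```shell
#!/bin/sh
# Start one background child and compare its process group with the script's own.
sleep 30 &
child=$!

# "pgid=" prints the column without a header; tr strips the padding ps adds.
self_pgid=$(ps -o pgid= -p "$$" | tr -d ' ')
child_pgid=$(ps -o pgid= -p "$child" | tr -d ' ')

echo "script PID $$ is in group $self_pgid; child $child is in group $child_pgid"

# Clean up the single child by PID, leaving the rest of the group alone.
kill -TERM "$child"
wait "$child" 2>/dev/null || true
```

Run from an interactive shell, both group numbers should match the script's PID. Run from another non-interactive script, they still match each other but may equal an ancestor's PID instead.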
In Unix and Linux you can send a signal to every process in that process group by calling kill with the negative value of the PGID. If you give kill a negative number as the target, it is interpreted as a process group ID rather than a PID. Since the script's PID ($$) is the same as its PGID, you can kill your process group in the shell with
kill -TERM -$$ # note the dash before $$
You have to give a signal number or name; otherwise kill parses the negative number as a signal option, and some implementations will tell you "Illegal option" or "invalid signal specification."
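Because $$ is the PGID only when the script leads its own group, a more defensive variant looks the group up first. The sketch below assumes a POSIX-style ps and kill, and sends signal 0, which delivers nothing and only checks that the target exists, so it is safe to run as written. The variable names are illustrative.

```shell
#!/bin/sh
# Look up the process group this script actually belongs to.
pgid=$(ps -o pgid= -p "$$" | tr -d ' ')

# A negative number after "--" addresses the whole group.
# Signal 0 is only an existence/permission check; nothing is delivered.
if kill -s 0 -- "-$pgid" 2>/dev/null; then
    group_status="exists"
else
    group_status="missing"
fi
echo "process group $pgid $group_status"

# To terminate the group for real (this also kills the script itself):
#   kill -s TERM -- "-$pgid"
```

The "--" is not strictly required once a signal has been named, but it makes it explicit that the negative number is a target and not another option.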
The simple code below illustrates all of this. It sets a trap signal handler, spawns 3 children, then waits on them until the signal handler kills the whole process group, the script included.
$ cat trap.sh
#!/bin/sh
signal_handler() {
    echo
    # printf + read is portable; read -p is an extension missing from some /bin/sh
    printf 'Interrupt: ignore? (y/n) [Y] >'
    read -r answer
    case $answer in
        [nN])
            kill -TERM -$$ # negative PID, kill process group
            ;;
    esac
}
trap signal_handler INT
for i in 1 2 3
do
    sleep 600 &
done
wait # don't exit until process group is killed or all children die
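One caveat: POSIX specifies that wait returns early, with a status above 128, when a trapped signal arrives, so in shells that follow this rule the script above can run past wait after an ignored interrupt. Here is a hedged sketch of one way to keep waiting. It uses USR1 and a helper subshell in place of a keyboard interrupt so that it can run unattended; both are assumptions made for the demonstration, as are the names on_signal and interrupts.

```shell
#!/bin/sh
interrupts=0
on_signal() {
    interrupts=$((interrupts + 1))   # stand-in for the interactive prompt
}
trap on_signal USR1

sleep 2 &                            # the child that should outlast the signal
( sleep 1; kill -USR1 "$$" ) &       # simulates the user interrupting later

# wait returns 0 once every child has exited; a trapped signal makes it
# return >128 early, so loop until it reports normal completion.
until wait; do
    :
done

echo "all children finished after $interrupts interrupt(s)"
```

In trap.sh the same idea would mean replacing the final wait with the until loop, so that answering "y" resumes waiting on the children.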
Here's a sample run:
$ ps -o pid,pgid,stat,args
PID PGID STAT COMMAND
8073 8073 Ss /bin/bash
17111 17111 R+ ps -o pid,pgid,stat,args
$
OK, no extra processes running. Start the test script, interrupt it (^C), choose to ignore the interrupt, and then suspend it (^Z):
$ sh trap.sh
^C
Interrupt: ignore? (y/n) [Y] >y
^Z
[1]+ Stopped sh trap.sh
$
Check the running processes and note the process group numbers (PGID):
$ ps -o pid,pgid,stat,args
PID PGID STAT COMMAND
8073 8073 Ss /bin/bash
17121 17121 T sh trap.sh
17122 17121 T sleep 600
17123 17121 T sleep 600
17124 17121 T sleep 600
17143 17143 R+ ps -o pid,pgid,stat,args
$
Bring our test script to the foreground (fg) and interrupt (^C) again, this time choosing not to ignore:
$ fg
sh trap.sh
^C
Interrupt: ignore? (y/n) [Y] >n
Terminated
$
Check running processes, no more sleeping:
$ ps -o pid,pgid,stat,args
PID PGID STAT COMMAND
8073 8073 Ss /bin/bash
17159 17159 R+ ps -o pid,pgid,stat,args
$
Note about your shell:
I had to modify your code to get it to run on my system. You have #!/bin/sh as the first line in your scripts, yet the scripts use extensions (from bash or zsh) which are not available in /bin/sh.
Best Answer
Since new processes all belong to the same process group, that of the parent process, have a process start a bunch of processes (fork), and then, with appropriate logging and a delay, type Ctrl+C. They all eat a SIGINT. (Add strace or sysdig or such to see the system calls or signals involved.)