Shell – Why did the trap not trigger

Tags: shell, signals, trap

Given a script that echoes a message upon receiving a SIGSTOP or SIGHUP signal:

$cat test.sh 
function clean_up {
    echo "cleaning up!"
}

echo 'starting!'

trap clean_up SIGSTOP SIGHUP

sleep 100

I ran it in the background:

$./test.sh > output &
[4] 42624

Then I killed it with -1, i.e. SIGHUP:

$kill -1 42624

But the trap did not work as I expected: the output file does not contain the line cleaning up!

$cat output 
starting!

What happened?

Best Answer

Actually, it does print, but only after sleep 100 has finished, so you have to wait 100 seconds to see the result.

You said:

$./test.sh > output &
[4] 42624

If you check ps aux, you will get something similar to:

xiaobai  42624  0.0  0.1 118788  5492 pts/3    S    04:07   0:00 /bin/bash
xiaobai  42626  0.0  0.0 108192   668 pts/3    S    04:07   0:00 sleep 100

Your main script process, /bin/bash with PID 42624, is still waiting for sleep 100 to finish. Even though the main process has received signal 1, it has to wait for the foreground sleep 100 to finish first; only then does it get its turn to run the trap handler, i.e. echo "cleaning up!".

You can run sleep as a background process instead. In that case the script process can run echo "cleaning up!" without waiting for sleep, but only if the script process has not already exited.
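One common way to apply this idea, sketched below, is to start the long-running command in the background and then block in the wait builtin. Unlike waiting for a foreground child, wait returns as soon as a trapped signal arrives, and the handler runs right away. This idiom is not spelled out in the answer itself; sleep 5 is a stand-in for the real job, and the trap_ran variable and the self-sent SIGHUP are only there to make the demo self-contained.

```shell
#!/bin/bash
# Sketch: background the long-running command and block in `wait`,
# which is interrupted by a trapped signal (a foreground child is not).

trap_ran=no

clean_up() {
    trap_ran=yes
    echo "cleaning up!"
}

trap clean_up HUP

echo 'starting!'

sleep 5 &            # stand-in for the real long-running job
sleep_pid=$!

# Demo helper (assumption, not in the original): send SIGHUP to this
# script after about one second.
( sleep 1; kill -HUP $$ ) &

wait "$sleep_pid"    # returns early, with a status above 128, when HUP arrives
status=$?
echo "wait returned $status"

# Do not leave the background sleep behind.
kill "$sleep_pid" 2>/dev/null
```

Run as a script, this should print cleaning up! about one second after starting!, rather than after the full five seconds. Note that the script still has to decide what to do after wait returns: here it kills the background sleep and exits.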

We can demonstrate this behaviour (i.e. the script waits for the foreground sleep before handling the signal) with this script:

function clean_up {
    echo "cleaning up!"
}

echo 'starting!'

trap clean_up SIGHUP
sleep 5000 &
echo 1
sleep 20
echo 2
sleep 10
echo 3

Run this script with ./test.sh > output & as before, then use ps aux to find the PID of the /bin/bash process, say 23311. Use cat output to follow the script's progress: once the script has entered sleep 20 after echo 1, run kill -1 23311. Then wait for 2 to appear; you will notice that cleaning up! was printed just before 2. Finally it is the turn of echo 3, and the script exits.

$ cat output                                                                                            
starting!
1
cleaning up!
2
3
$ 

The experiment above shows that the script receives SIGHUP and runs the signal handler only once the current foreground process has finished, and that it does not wait for the background process.
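The same experiment can be condensed into a script that signals itself, which avoids the manual ps and kill steps. This is only a sketch with shortened timings: the start and handled_after variables and the self-signalling subshell are additions for illustration, not part of the original test.

```shell
#!/bin/bash
# Sketch: a trapped signal that arrives during a foreground command is
# handled only after that command has finished.

start=$(date +%s)
handled_after=

clean_up() {
    # Record how many seconds passed between script start and the handler.
    handled_after=$(( $(date +%s) - start ))
    echo "cleaning up! (handled after ${handled_after}s)"
}

trap clean_up HUP

# Demo helper (assumption, not in the original): send SIGHUP to this
# script after about one second, while the foreground sleep is running.
( sleep 1; kill -HUP $$ ) &

echo 1
sleep 3      # foreground: the pending HUP is not handled until this returns
echo 2
```

Although the signal is sent after roughly one second, the cleaning up! line should appear between 1 and 2 only after the three-second foreground sleep has returned.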

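The story becomes more interesting: with your original script, what if you run kill -1 PID_of_sleep_100?

Both processes exit without printing anything. The sleep process is killed, the script then runs to its end and exits, and the trap never fires because the sleep process does not pass SIGHUP on to the script process.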

So there is a solution to your task: first run kill -1 on the script process (the parent PID of sleep) so that SIGHUP is pending in the script, then run kill -1 on the sleep process. The script process will then run its SIGHUP handler before exiting. Happy hacking :)
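The two-step kill can be tried out with a small driver script such as the following sketch. It assumes pkill from procps is installed (its -P option selects processes by parent PID); the temporary directory, the variable names, and the shortened sleep 10 (instead of sleep 100) are choices made only for this demo.

```shell
#!/bin/bash
# Sketch of the two-step kill: signal the script first, then its sleep child.

tmpdir=$(mktemp -d)

# A copy of the script from the question, with a shorter sleep.
cat > "$tmpdir/test.sh" <<'EOF'
clean_up() {
    echo "cleaning up!"
}
echo 'starting!'
trap clean_up HUP
sleep 10
EOF

bash "$tmpdir/test.sh" > "$tmpdir/output" &
script_pid=$!
sleep 1      # give the script time to install its trap and start sleep

kill -HUP "$script_pid"         # step 1: SIGHUP is now pending in the script
pkill -HUP -P "$script_pid"     # step 2: end the foreground sleep (its child)

wait "$script_pid" || true      # the script's exit status is not checked here

result=$(cat "$tmpdir/output")
echo "$result"
rm -rf "$tmpdir"
```

If the steps are swapped, or step 1 is skipped, the script should exit without ever printing cleaning up!, as described above.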

Not all signals behave the same; for example, SIGKILL cannot be trapped (nor can SIGSTOP, which the script in the question tries to trap). If you kill -9 the script process, the sleep process keeps running and its parent PID becomes 1 (check with ps -ef).

[UPDATE]:

Normally kill -N script_PID kills the script immediately if your script does not trap signal N. But be careful with SIGINT: kill -2 script_PID does not kill the script immediately, even though your script does not trap it. Your script waits for its child process (e.g. sleep 10) to finish. Now assume you send several signals to script_PID, i.e. kill -2 script_PID and kill -user-defined_trap_N script_PID, where user-defined_trap_N is a signal your script traps. Then:

  1. If sleep returns normally after its 10 seconds, or is killed by a signal other than 2, your script ignores the pending signal 2 when sleep returns and then runs the user-defined_trap_N function.
  2. But if sleep is killed by signal 2, your script runs the built-in SIGINT handler when sleep returns and exits immediately, without running the user-defined_trap_N function.