I have a script to kill a process tree that works fine on Linux, but I am seeing some odd behaviour on OS X. It works fine in my unit tests and also when I run it manually on OS X, but for some reason it behaves differently when it runs as a Jenkins job.
This is the current bash function, with some debug echoes and sleeps added:
killtree() {
    local _pid=$1
    local _sig=${2:-TERM}   # signal name without the dash; kill adds it below
    local children _child   # keep loop state local to each recursion level
    echo "Stopping ${_pid}"
    sleep 1
    kill -STOP "${_pid}"    # stop parent to avoid creation of new children
    children=$(pgrep -P "${_pid}")
    echo "Children=$children"
    sleep 1
    for _child in $children; do
        killtree "${_child}" "${_sig}"
    done
    echo "Killing child ${_pid}"
    sleep 1
    kill "-${_sig}" "${_pid}"
}
In a failing run the call to pgrep, for example pgrep -P 9651, prints out all processes on the machine, and the script hangs when it tries to kill pid 0.
But why would it return all processes? When the run is done, process 9651 is still running, and if I issue pgrep -P 9651 on the command line there is no output at all (which is expected, since this process should have no children).
I added a debug call to print the process tree right before listing children:
+ pstree='-+= 00001 root /sbin/launchd
\-+= 09774 root /usr/sbin/sshd -i
\-+- 09777 jenkins /usr/sbin/sshd -i
\-+= 09783 jenkins bash -c cd '\''/var/jenkins'\'' && java -jar slave.jar
\-+- 09784 jenkins /usr/bin/java -jar slave.jar
\-+- 09807 jenkins /Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/jre/bin/java -classpath/
\-+- 09817 jenkins /Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/jre/bin/java -
\--- 09828 jenkins sleep 10'
That looks normal to me: the sleep 10 process has no children.
Any ideas? I am a bit stuck after trying to debug this for some hours.
The process being killed is in this case a simple sleep 10 that is used for testing.
Best Answer
Where are you getting your version of pgrep from?
The version I have from MacPorts is coded such that if you do not supply a pattern it will match all processes, even if you have qualifiers such as the -P option. When I issue pgrep -P<ppid> I get a full list of processes. If I add a pattern, as in pgrep -P<ppid> \. then it works as expected, only providing processes with the given ppid.
As to the behavior difference, maybe you have a couple of versions of pgrep on your machine, and the Jenkins jobs have a different PATH so they are finding a different version?
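Since the result of pgrep -P depends on which pgrep is found, one way to sidestep the difference is to list children from ps output instead. The following is only a sketch: list_children is a hypothetical helper name, not part of the original script, and it assumes a ps that accepts the POSIX options -A and -o pid= -o ppid= (both the BSD ps on OS X and procps on Linux should).

```shell
#!/usr/bin/env bash
# list_children PID
# Hypothetical helper (not from the original script): prints the direct
# children of PID, one per line, by filtering ps output rather than
# calling pgrep -P.
list_children() {
    local _ppid=$1
    # "pid=" and "ppid=" request the two columns with empty headers,
    # so ps prints no header row. awk keeps rows whose parent is _ppid.
    ps -A -o pid= -o ppid= | awk -v ppid="$_ppid" '$2 == ppid { print $1 }'
}
```

In killtree, the pgrep line could then become children=$(list_children "${_pid}"). A process with no children yields empty output, so the loop is simply skipped.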
From a terminal window you can look for multiple versions with:
I also suggest you compare the PATH variable used in the job vs. interactive.
To view which file the shell will use, you can use type -p pgrep, and type -a pgrep will show all places in the PATH where pgrep can be found.
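To make that comparison easy, the same checks can be wrapped in a small function and run both in the Jenkins job and in an interactive shell. This is a rough sketch for bash; report_tool is a made-up name used only for illustration.

```shell
#!/usr/bin/env bash
# report_tool NAME
# Hypothetical diagnostic helper: prints the current PATH, the file the
# shell would run for NAME, and every match for NAME on the PATH.
report_tool() {
    local _tool=$1
    echo "PATH=$PATH"
    echo "first match: $(type -p "$_tool")"
    echo "all matches:"
    type -a "$_tool"
}

# Usage (run in both environments and compare the output):
#   report_tool pgrep
```

If the two outputs show a different first match, the job and the interactive shell are running different pgrep binaries.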