MacOS – odd behavior of pgrep in bash script

bash macos terminal

I have a script to kill a process tree that works fine on Linux, but I am seeing some odd behaviour on OS X. It works fine in my unit tests and when I run it manually on OS X, but when it runs as a Jenkins job it behaves differently.

This is the current bash function, with some debug echo and sleep calls added:

killtree() {
  local _pid=$1
  local _sig=${2:-TERM} # signal name without a leading dash; the dash is added below
  local children _child # keep each recursion level's list separate
  echo "Stopping ${_pid}"
  sleep 1
  kill -STOP "${_pid}" # stop parent to avoid creation of new children
  children=$(pgrep -P "${_pid}")
  echo "Children=$children"
  sleep 1
  for _child in $children; do
      killtree "${_child}" "${_sig}"
  done
  echo "Killing child ${_pid}"
  sleep 1
  kill "-${_sig}" "${_pid}"
}

In a failing run, the call to pgrep (for example pgrep -P 9651) prints every process on the machine, and the script hangs when it tries to kill pid 0.

But why would it return all processes? After the run, process 9651 is still running, and if I issue pgrep -P 9651 on the command line there is no output at all (which is expected, since this process should have no children).

I added a debug call to print the process tree right before listing children:

+ pstree='-+= 00001 root /sbin/launchd
 \-+= 09774 root /usr/sbin/sshd -i
   \-+- 09777 jenkins /usr/sbin/sshd -i
     \-+= 09783 jenkins bash -c cd '\''/var/jenkins'\'' && java  -jar slave.jar
       \-+- 09784 jenkins /usr/bin/java -jar slave.jar
         \-+- 09807 jenkins /Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/jre/bin/java -classpath/     
          \-+- 09817 jenkins /Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/jre/bin/java -
            \--- 09828 jenkins sleep 10'

This looks normal to me: the sleep 10 has no children.

Any ideas? I am a bit stuck after several hours of debugging.

The process being killed is, in this case, a simple sleep 10 that is used for testing.

Best Answer

Where are you getting your version of pgrep from?

The version I have from MacPorts is coded such that, if you do not supply a pattern, it matches all processes even if you pass qualifiers such as the -P option.

When I issue pgrep -P<ppid> I get a full list of processes. If I add a pattern, as in pgrep -P<ppid> \., it works as expected and lists only the processes with the given ppid. (The shell strips the backslash, so pgrep receives the pattern ., a regular expression that matches any process name.)
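If you would rather not depend on how a particular pgrep build treats a missing pattern, one option is to derive the child list from ps instead. The following is only a sketch: list_children is a hypothetical helper name, not part of the original script, and it assumes a ps that accepts the POSIX -A and -o options.

```shell
# Hypothetical helper: print the PIDs whose parent PID is "$1".
# Uses only POSIX ps options, so it does not rely on any pgrep behaviour.
list_children() {
  # "pid=" and "ppid=" request the two columns with empty headers,
  # so ps prints no header line at all.
  ps -A -o pid= -o ppid= | awk -v parent="$1" '$2 == parent { print $1 }'
}
```

In killtree, the pgrep line could then become children=$(list_children "${_pid}"). A process with no children produces empty output, so the loop body is simply skipped.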

As to the behavior difference, maybe you have a couple of versions of pgrep on your machine, and the Jenkins jobs run with a different PATH and so find a different version?

From a terminal window you can look for multiple versions with:

mdfind -name pgrep

I also suggest you compare the PATH variable used in the job with the one in your interactive shell.

To see which file the shell will use, run type -p pgrep; type -a pgrep shows every place in the PATH where pgrep can be found.
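To make that comparison from inside the Jenkins job itself, something like the following could be added temporarily to the job's script. It is a sketch only: find_on_path is a made-up helper name, and it does roughly what type -a does for external commands by walking PATH in lookup order.

```shell
# Hypothetical helper: list every executable file named "$1" on PATH,
# in the order the shell would search for it.
find_on_path() {
  name=$1
  old_ifs=$IFS
  IFS=:
  for dir in $PATH; do
    # An empty PATH entry means the current directory.
    [ -n "$dir" ] || dir=.
    if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
      echo "$dir/$name"
    fi
  done
  IFS=$old_ifs
}

echo "PATH=$PATH"
find_on_path pgrep
```

The first path printed is the one the job would run. If it differs from what type -p pgrep reports in your terminal, the two environments are using different pgrep binaries.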