I originally asked this question on Stack Overflow, then realised that this is probably a better place for it.
I have bluepill set up to monitor my delayed_job processes (Ruby on Rails application).
I am using Ubuntu 12.10.
I am starting and monitoring the bluepill service itself using Ubuntu's upstart. My upstart config is below:

    description "Start up the bluepill service"
    start on runlevel
    stop on runlevel
    expect daemon
    exec sudo /home/deploy/.rvm/wrappers/<app_name>/bluepill load /home/deploy/websites/<app_name>/current/config/server/staging/delayed_job.bluepill
    # Restart the process if it dies with a signal
    # or exit code not given by the 'normal exit' stanza.
    respawn

I have also tried expect fork instead of expect daemon, and I have tried removing the expect... line completely.
When the machine boots, bluepill starts up fine.

    $ ps aux | grep blue
    root 1154 0.6 0.8 206416 17372 ? Sl 21:19 0:00 bluepilld: <app_name>

The PID of the bluepill process is 1154 here, but upstart seems to be tracking the wrong PID: one that does not exist.

    $ initctl status bluepill
    bluepill start/running, process 990

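The mismatch can be checked from a shell with something like the following. This is only a rough sketch: the helper names tracked_pid and pid_state are made up for illustration, it assumes the job is called bluepill as in the output above, and it relies on Linux's /proc to tell whether a PID exists.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the PID upstart tracks with reality.

# Read `initctl status <job>` output on stdin and print the PID it reports,
# e.g. "bluepill start/running, process 990" gives 990.
# Prints nothing when the line contains no "process <pid>" part.
tracked_pid() {
  sed -n 's/.*, process \([0-9][0-9]*\).*/\1/p'
}

# Print "alive" if a process with the given PID exists, otherwise "missing".
# Assumption: a Linux system with /proc mounted (true on stock Ubuntu).
pid_state() {
  if [ -n "$1" ] && [ -d "/proc/$1" ]; then
    echo "alive"
  else
    echo "missing"
  fi
}
```

For example, `initctl status bluepill | tracked_pid` should print 990 for the output above, and `pid_state 990` would then print missing if upstart really is following a PID that no longer exists.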
I think it is tracking the PID of the sudo process which started the bluepill process.
This is preventing the bluepill process from getting respawned if I forcefully kill bluepill using
Moreover, I think because of the wrong PID being tracked, reboot / shutdown just hangs and I have to hard reset the machine every time.
What could be the issue here?
The problem remains as of today (3 May 2015) on Ubuntu 14.04.2.
The problem is not because of using sudo. I am not using sudo anymore. My updated upstart config is this:

    description "Start up the bluepill service"
    start on runlevel
    stop on runlevel
    # Restart the process if it dies with a signal
    # or exit code not given by the 'normal exit' stanza.
    respawn
    # Give up if restart occurs 10 times in 90 seconds.
    respawn limit 10 90
    expect daemon
    script
      shared_path=/home/deploy/websites/some_app/shared
      bluepill load $shared_path/config/delayed_job.bluepill
    end script

When the machine boots, the program loads up fine. But upstart still tracks the wrong PID, as described above.
The workaround mentioned in the comments may fix the hanging issue. I haven't tried it, though.