Ubuntu – Prevent duplicate script runs at the same time

Tags: command-line, cron, scripts

I am using scrapy to fetch some resources, and I want to run it as a cron job that starts every 30 minutes.

The cron:

0,30 * * * * /home/us/jobs/run_scrapy.sh

run_scrapy.sh:

#!/bin/sh
cd ~/spiders/goods
PATH=$PATH:/usr/local/bin
export PATH
pkill -f $(pgrep run_scrapy.sh | grep -v $$)
sleep 2s
scrapy crawl good

As the script shows, I tried to kill any already-running script process and its child process (scrapy) as well.

However, when I run two instances of the script, the newer instance does not kill the older one.

How to fix that?


Update:

I have more than one scrapy .sh script, each running at a different frequency configured in cron.


Update 2 – Testing Serg's answer:

All cron jobs were stopped before I ran the test.

Then I opened three terminal windows, named w1, w2 and w3, and ran commands in the following order:

1. Run `pgrep scrapy` in w3, which prints nothing (no scrapy is running at the moment).

2. Run `./scrapy_wrapper.sh` in w1.

3. Run `pgrep scrapy` in w3, which prints one process id, say `1234` (scrapy has been started by the script).

4. Run `./scrapy_wrapper.sh` in w2. Checking w1, I found the script there had been terminated.

5. Run `pgrep scrapy` in w3, which prints two process ids, `1234` and `5678`.

6. Press `Ctrl+C` in w2 (twice).

7. Run `pgrep scrapy` in w3, which prints one process id, `1234` (the scrapy process `5678` has been stopped).

At this point, I have to use `pkill scrapy` to stop the scrapy process with id `1234`.

Best Answer

A better approach would be to use a wrapper script that calls the main script. It would look like this:

#!/bin/bash
# This is /home/user/bin/wrapper.sh file
pkill -f 'main_script.sh'
exec bash ./main_script.sh

Of course, the wrapper has to be named differently, so that pkill matches only your main script. This way your main script reduces to this:

#!/bin/sh
cd /home/user/spiders/goods
PATH=$PATH:/usr/local/bin
export PATH
scrapy crawl good

Note that in my example I am using ./ because the script was in my current working directory. Use the full path to your script for best results.
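As a side note (my addition, not part of the original answer): before wiring this into cron, you can dry-run the pattern with pgrep, which matches the same way pkill does:

```shell
# Start a long-running dummy process, then use pgrep -af to list the
# PID and full command line of every process matching the pattern.
# -f matches against the full command line, exactly as pkill -f will,
# so this shows precisely what pkill -f would kill.
sleep 300 &
pid=$!
pgrep -af 'sleep 300'   # lists the dummy process
kill "$pid"             # clean up the background job
```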

I have tested this approach with a simple main script that just runs an infinite while loop, plus the wrapper script. As you can see in the screenshot, launching a second instance of the wrapper kills the previous one.

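Since the screenshot does not reproduce here, the test can be recreated as follows (a sketch; the /tmp file names are my own choice, and the loop script mirrors the one described above):

```shell
#!/bin/bash
# Dummy main script that loops forever, standing in for the real job.
cat > /tmp/main_script.sh <<'EOF'
#!/bin/bash
while true; do sleep 1; done
EOF

# Wrapper that kills any earlier instance before exec'ing a new one.
cat > /tmp/wrapper.sh <<'EOF'
#!/bin/bash
pkill -f '/tmp/main_script.sh'
exec bash /tmp/main_script.sh
EOF
chmod +x /tmp/main_script.sh /tmp/wrapper.sh

/tmp/wrapper.sh &   # first instance
sleep 1
/tmp/wrapper.sh &   # second instance: its pkill terminates the first
sleep 1
pgrep -cf '/tmp/main_script.sh'   # should print 1: only the newer instance survives
pkill -f '/tmp/main_script.sh'    # clean up
```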

Your script

This is just an example. Remember that I have no access to scrapy to actually test this, so adjust it as needed for your situation.

Your cron entry should look like this:

0,30 * * * * /home/us/jobs/scrapy_wrapper.sh

Contents of scrapy_wrapper.sh

#!/bin/bash
pkill -f 'run_scrapy.sh'
exec sh /home/us/jobs/run_scrapy.sh

Contents of run_scrapy.sh

#!/bin/bash
cd /home/user/spiders/goods
PATH=$PATH:/usr/local/bin
export PATH
# sleep delay now is not necessary
# but uncomment if you think it is
# sleep 2
scrapy crawl good
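For completeness (my own addition, not part of Serg's answer): if the goal is to skip an overlapping run rather than kill the older one, the usual tool is flock(1) from util-linux. The lock-file paths below are assumptions:

```shell
#!/bin/bash
# flock -n takes an exclusive lock on the lock file and exits
# immediately with status 1 if a previous run still holds it, so two
# runs can never overlap. A cron-ready wrapper would be a one-liner:
#   exec flock -n /tmp/run_scrapy.lock /home/us/jobs/run_scrapy.sh
# Demonstration with a dummy command and a hypothetical lock file:
flock /tmp/demo.lock sleep 2 &        # first "run" holds the lock
sleep 1
flock -n /tmp/demo.lock echo "second run" \
  || echo "skipped: previous run still active"
wait
```

The trade-off versus the pkill wrapper: flock lets the running job finish and drops the new one, whereas the wrapper kills the old job and starts fresh.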