Shell – How to run no more than n parallel subshells

parallelism shell-script subshell

I am trying to run subscripts from a main script, but I want to make sure that no more than n subscripts run at the same time.

The following simplified example illustrates.

Each subscript creates a dummy file in RAM (/dev/shm/) with a name made of a unique timestamp, and deletes it once done.

The main script counts the number of dummy files in /dev/shm/ originating from the subscripts, and doesn't (shouldn't) launch a new subscript if 2 or more of them are already running.

However, the main script seems to ignore the while condition, and launches all 5 subscripts at once.

What's wrong with my code?

mainscript.txt

#!/bin/bash
for counter in $(seq 1 5)
do
        while [ $(ls -1 /dev/shm/|grep "script044"|wc -l) -ge 2 ]
        do
                sleep 0
        done

        xterm -e "bash script044.txt" &
done

exit

script044.txt (subscript)

#!/bin/bash

tempfilename="script044_"$(date +%Y%m%d%H%M%S)_${RANDOM}
echo > /dev/shm/${tempfilename}

for counter in $(seq 1 $(shuf -i 10-45 -n 1))
do
        sleep 1
        printf "${counter}\r"
done

rm /dev/shm/${tempfilename}

exit

Best Answer

(A note on convention: .txt is normally used for plain text files and .sh for shell script files.)

Your mainscript.txt script has a race condition. The xterm command is started in the background and returns immediately, so the for loop reaches its next while check before the script044.txt script has had time to create the temporary file. In fact, the whole loop is iterated through before any of these files get created.
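The race can be reproduced without xterm. The following is only an illustrative sketch: the 0.2 second delay is an assumed stand-in for the time xterm and bash take to start, and the temporary directory, `marker.N` file names and `seen_max` variable are hypothetical. On a typical system each pass of the loop should see no marker files, because the loop finishes long before the first child creates one.

```shell
#!/bin/bash
# Illustration of the race, not the original scripts.
tmp=$(mktemp -d)        # scratch directory standing in for /dev/shm
seen_max=0              # largest number of marker files the loop ever saw

for i in 1 2 3 4 5; do
    # Same kind of check as the while condition: count existing markers.
    count=$(ls "$tmp" | wc -l)
    (( count > seen_max )) && seen_max=$count
    echo "launch $i: $count marker file(s) visible"

    # The child needs a moment before it creates its marker
    # (assumed 0.2 s startup delay), then works and cleans up.
    ( sleep 0.2; touch "$tmp/marker.$i"; sleep 0.3; rm "$tmp/marker.$i" ) &
done

wait                    # let all children finish
rmdir "$tmp"
```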

A more robust way to deal with this sort of thing is to forget the temporary files and use the shell builtin wait instead:

#!/bin/bash

pid_count=0
for counter in $(seq 1 5)
do
    xterm -e "bash script044.txt" &
    # Once 2 subscripts are running, block until any one of them exits
    if (( ++pid_count >= 2 )); then
        wait -n
        ((pid_count--))
    fi
done

This increments a counter every time a subprocess is started. Once the counter reaches the limit of 2, we wait for the next subprocess to finish. When wait returns, we then decrement the counter and go around again to start the next xterm.
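The same counting pattern can be wrapped in a function that takes the limit as a setting. This is a minimal sketch, assuming bash 4.3 or later; `max_jobs`, `run_limited` and the `sleep` commands are hypothetical names and stand-ins, not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical helper: run each argument as a background command,
# never keeping more than max_jobs of them alive at once.
max_jobs=2              # assumed limit (the "n" from the question)

run_limited() {
    local running=0 cmd
    for cmd in "$@"; do
        # At the limit: block until any one child exits (needs bash 4.3+).
        if (( running >= max_jobs )); then
            wait -n
            (( running-- ))
        fi
        bash -c "$cmd" &
        (( ++running ))
    done
    wait                # collect the children that are still running
}

# Stand-in commands; in the question this would be the xterm call.
run_limited "sleep 0.3" "sleep 0.1" "sleep 0.2" "sleep 0.1" "sleep 0.2"
```

Checking the limit before each launch, rather than after, means the final command is not followed by an extra `wait -n`; the plain `wait` at the end takes care of whatever is still running.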

You can remove all the tempfilename-related lines from script044.txt; they are no longer needed.


As @chepner points out, the required -n option is only available in bash 4.3 or later.
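On an older bash without `wait -n`, one possible fallback is to poll the shell's own job table instead of temporary files. The sketch below is an assumption-laden illustration rather than part of the answer above: `max_jobs`, `peak`, the 0.1 second polling interval and the `sleep 0.5` stand-in for the xterm call are all hypothetical. Because the job is registered in the job table as soon as `&` returns, this avoids the startup race of the marker-file approach.

```shell
#!/bin/bash
# Sketch for shells without "wait -n": poll the job table instead.
max_jobs=2              # assumed limit
peak=0                  # bookkeeping only, to observe the highest job count

for counter in 1 2 3 4 5; do
    # jobs -r lists only running jobs; -p prints one PID per line.
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
        sleep 0.1       # short pause instead of a busy loop
    done

    sleep 0.5 &         # stand-in for: xterm -e "bash script044.txt" &

    running=$(jobs -rp | wc -l)
    (( running > peak )) && peak=$running
done

wait                    # let the last jobs finish
echo "highest number of simultaneous jobs: $peak"
```

Polling costs a little latency between one job ending and the next starting, which is the trade-off against the blocking `wait -n`.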
