Bash – Running Consecutive and Parallel Loops/Commands

bash parallelism scripting

I want to run some simulations using a Python tool that I wrote. The catch is that I have to call it multiple times with different parameters/arguments.

For now, I am using nested for loops for the task, like:

for simSeed in 1 2 3 4 5
do
    for launchPower in 17.76 20.01 21.510 23.76
    do
        python sim -a $simSeed -p $launchPower
    done
done

In order for the simulations to run simultaneously, I append a & at the end of the line where I call the simulator.

python sim -a $simSeed -p $launchPower &

Using this method I am able to run multiple such seeds at once. However, since my computer has limited memory, I want to rewrite the above script so that the inner for loop runs in parallel and the outer for loop runs sequentially.

As an example, for simSeed = 1, I want 4 different processes to run, one for each launchPower value: 17.76, 20.01, 21.510 and 23.76. As soon as all of them have finished, I want the script to move on to simSeed = 2 and again start 4 parallel processes with launchPower equal to 17.76, 20.01, 21.510 and 23.76.

How can I achieve this task?

TLDR:

I want the outer loop to run sequentially and the inner loop to run in parallel, such that the outer loop moves to the next iteration only when the last parallel process of the inner loop has finished.

Best Answer

GNU parallel has several options to limit resource usage when starting jobs in parallel.

The basic usage for two nested loops would be

parallel python sim -a {1} -p {2} ::: 1 2 3 4 5 ::: 17.76 20.01 21.510 23.76

If you want to launch at most 5 jobs at the same time, for example, you could say

parallel -j5 python <etc.>

Alternatively, you can use the --memfree option to start new jobs only when enough memory is free, e.g. at least 256 MB:

parallel --memfree 256M python <etc.>

Note that the last option will kill the most recently started job if the free memory falls below 50% of the "reserve" value stated (but the job will be re-queued automatically and run again later).
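
For the exact behaviour asked for in the question (seeds one after another, all launchPower values of one seed at the same time), a plain-shell sketch that does not rely on GNU parallel is to keep the & on the inner call and add the shell builtin wait after the inner loop. The following is only an illustration under some assumptions: run_sim is a hypothetical stand-in for "python sim -a ... -p ...", and the temporary log file exists only to make the ordering visible.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for: python sim -a "$1" -p "$2"
# Replace the body with the real simulator call.
run_sim() {
    sleep 0.1
    echo "seed=$1 power=$2"
}

# Illustrative log file, used only to show the order of events.
log=$(mktemp)

for simSeed in 1 2 3 4 5
do
    for launchPower in 17.76 20.01 21.510 23.76
    do
        # Start each inner-loop job in the background.
        run_sim "$simSeed" "$launchPower" >> "$log" &
    done
    # Block until every background job started by this shell has exited,
    # so the next seed only starts once all four jobs are done.
    wait
    echo "seed $simSeed finished" >> "$log"
done

cat "$log"
```

The four lines belonging to one seed may appear in any order, but each "seed N finished" line should only show up after all four jobs for that seed have written their output. The same effect should also be obtainable by keeping the outer for loop and calling parallel once per seed with only the launchPower values after :::, since parallel itself does not return until its jobs have completed.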