Background execution in parallel

background-process parallelism

I have a program ./pgm that takes some arguments (say -a file1 -b val) and needs about 2 seconds to execute. I would like to use all the processors on my machine to run this program in parallel on all of my input files (about 1000). What I do now is put all the commands

./pgm -a file1 -b 12 > out1.txt &
./pgm -a file2 -b 14 > out2.txt &
./pgm -a file3 -b 16 > out3.txt &
./pgm -a file4 -b 18 > out4.txt &
...

in a file, and execute this file. I thought this would use all the available processors, but the number of parallel executions is very limited.

How can I achieve this? Note that the parallel command is not an option.

Best Answer

With GNU xargs:

seq 1000 | xargs -P4 -n1 sh -c 'exec ./pgm -a "file$1" -b 12 > "out.$1"' sh

This would run up to 4 ./pgm instances in parallel.
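To see how the wrapper works, here is the same pattern with echo standing in for ./pgm (the file names and -b value are just the question's examples): xargs hands each number from seq to a fresh sh as its "$1", running up to 4 at a time.

```shell
# Same xargs -P/-n pattern, with echo as a harmless stand-in for ./pgm.
# The trailing "sh" becomes $0 of the inner shell; each number becomes $1.
seq 8 | xargs -P4 -n1 sh -c 'echo "would run: ./pgm -a file$1 -b 12 > out.$1"' sh
```

Note that the output lines may appear out of order, since up to 4 inner shells run concurrently.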

Otherwise, with pdksh/mksh/oksh:

trap : CHLD
n=0
for f in file*; do
  while (( $(jobs | wc -l) >= 4 )); do
    wait   # with the CHLD trap, returns as soon as one job exits
  done
  ./pgm -a "$f" -b 12 > "out.$((++n))" &
done
trap - CHLD
wait
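The pool check above relies on jobs(1) listing the shell's background jobs, one per line. This can be seen in isolation (sleep stands in for ./pgm here):

```shell
# Counting background jobs with jobs(1), as the pool check above does.
sleep 1 &
sleep 1 &
running=$(jobs | wc -l)
echo "$running jobs running"
wait   # reap both before moving on
```

Whether $(jobs) reports the parent shell's jobs from inside a command substitution is itself shell-dependent; it does in pdksh/mksh and bash.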

The details of signal handling vary from one shell to the next. That trick works in pdksh and its derivatives, but not in any other shell I tried: you need a shell where one can trap SIGCHLD (which excludes bash), where the SIGCHLD handler is executed straight away rather than being blocked during a wait (which excludes ash and yash), and where the SIGCHLD handling interrupts the wait (which excludes ksh93 and zsh).

In shells other than bash, you could also look at approaches where jobs are started in the SIGCHLD handler.
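A minimal sketch of that idea, subject to the same shell-dependent SIGCHLD semantics discussed above. The totals, the pool limit, and the sleep/echo job are stand-ins for illustration, not part of the original answer; replace the braced job with the real ./pgm invocation.

```shell
# Sketch: refill the job pool from the SIGCHLD handler itself.
total=6    # jobs to run in all (stand-in for the ~1000 files)
limit=3    # maximum concurrent jobs
started=0  # jobs launched so far
finished=0 # jobs reaped so far

start_next() {
  # Top the pool up to $limit until everything has been started.
  while [ "$started" -lt "$total" ] &&
        [ "$((started - finished))" -lt "$limit" ]; do
    started=$((started + 1))
    { sleep "0.$started"; echo "job $started done"; } &  # stand-in for ./pgm
  done
}

# Each time a child exits, count it and refill the pool from the handler.
trap 'finished=$((finished + 1)); start_next' CHLD

start_next
while [ "$finished" -lt "$total" ]; do
  wait   # in suitable shells, each SIGCHLD interrupts this wait
done
trap - CHLD
```

As with the loop above, this depends on the handler running once per exiting child and on SIGCHLD interrupting wait, so it is a sketch of the approach rather than a portable script.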
