Parallel Tasks – How to Run Four Tasks in Parallel

background-process parallelism shell

I have a bunch of PNG images in a directory. I compress them with an application called pngout, which I call from a script I wrote. The problem is that the script processes one file at a time, something like this:

FILES=(./*.png)
for f in "${FILES[@]}"
do
        echo "Processing $f file..."
        # Take action on each file; $f holds the current file name.
        # Quote the names so files with spaces work; ${f#./} strips the leading "./".
        ./pngout -s0 "$f" "R${f#./}"
done

Processing just one file at a time takes a lot of time. While the app runs, I see that CPU usage is just 10%. So I discovered that I can divide the files into 4 batches, put each batch in its own directory, and start four instances of my script from four terminal windows. They process the images at the same time, and the job takes 1/4 of the time.

The second problem is that I lose time dividing the images into batches, copying the script to four directories, opening 4 terminal windows, and so on.

How do I do that with one script, without having to divide anything?

I mean two things. First: how do I start a process in the background from a bash script? (Just add & to the end?) Second: how do I stop sending tasks to the background after the fourth one and make the script wait until a task ends? That is, I want to send a new task to the background only as another one finishes, always keeping 4 tasks running in parallel. If I do not do that, the loop will fire zillions of tasks into the background and the CPU will clog.

Best Answer

If you have a copy of xargs that supports parallel execution with -P, you can simply do

printf '%s\0' *.png | xargs -0 -I {} -P 4 ./pngout -s0 {} R{}

For other ideas, the Wooledge Bash wiki has a section in the Process Management article describing exactly what you want.
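If xargs -P is not available, one possible pure-bash approach is sketched below. This is an illustration, not a copy of the wiki's code. Appending & to a command starts it in the background, and wait -n (bash 4.3 or later) blocks until any one background job exits, which is enough to keep a fixed number of jobs running. The names MAX_JOBS, compress_one, and run_limited are made up for this example, and the pngout arguments are taken from the command above.

```shell
#!/usr/bin/env bash
# Sketch only: keep at most MAX_JOBS background jobs running at a time.
# Assumes bash 4.3+ (for `wait -n`). All names here are illustrative.

MAX_JOBS=4

# Hypothetical wrapper around the real command; adjust as needed.
compress_one() {
    ./pngout -s0 "$1" "R$1"
}

run_limited() {
    local f running=0
    for f in "$@"; do
        if (( running >= MAX_JOBS )); then
            # All slots are busy: block until any one job exits.
            # Its exit status is ignored in this sketch.
            wait -n || true
            running=$((running - 1))
        fi
        compress_one "$f" &          # trailing & runs the job in the background
        running=$((running + 1))
    done
    wait                             # wait for the remaining jobs to finish
}

shopt -s nullglob                    # an empty directory yields no arguments
run_limited *.png
```

Because a new job starts as soon as any slot frees up, a single slow image does not hold up the rest, unlike a scheme that launches four jobs and then waits for all four.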
