I've found answers close to this, but I fail to understand how to use them in my case (I'm rather new to Bash). I'm trying to process a folder containing a large image sequence (100k+ files) with ImageMagick, and I'd like to use GNU Parallel to speed things up.
This is the code I use (processing 100 frames at a time to avoid running out of RAM):
calcmethod1=mean
allframes=(*.png)
cd out1
for (( i=0; i < "${#allframes[@]}" ; i+=100 )); do
    convert "${allframes[@]:i:100}" -evaluate-sequence "$calcmethod1" \
        -channel RGB -normalize ../out2/"${allframes[i]}"
done
How would I 'parallelize' this? Most solutions I've found don't use a loop but pipe into parallel instead; when I tried that, my script broke because the argument list got too long. I guess what I want is for parallel to split the load, handing the first 100 frames to core 1, frames 100-199 to core 2, and so on.
Best Answer
Order
Your sample program did not seem to care about the order of the *.png files in the allframes array that you were constructing, but your comments led me to believe that order would matter.
Therefore I'd start with a modification to your script, changing the construction of the allframes array so that the files are stored in numeric order; this can be simplified using sort -zV.
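A sketch of how such a sorted array might be built, assuming bash 4.4+ for mapfile -d; the frame names and the temporary directory here are purely for demonstration:

```shell
#!/usr/bin/env bash
# Build allframes in numeric (version) order rather than the glob's
# lexicographic order, where frame10.png would sort before frame9.png.
# Demo setup with hypothetical frame names:
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch frame1.png frame2.png frame9.png frame10.png

# Long form: read NUL-delimited, version-sorted names into the array.
allframes=()
while IFS= read -r -d '' f; do
    allframes+=("$f")
done < <(printf '%s\0' *.png | sort -zV)

# Equivalent short form (bash >= 4.4):
mapfile -t -d '' allframes < <(printf '%s\0' *.png | sort -zV)

printf '%s\n' "${allframes[@]}"   # frame1.png frame2.png frame9.png frame10.png
```

The NUL delimiters (printf '%s\0', sort -z, read -d '') keep file names containing spaces or newlines intact.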
This has the effect of constructing your convert commands so that they consume the frames in numeric order.

Parallels
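For illustration, a single batch's command then takes this shape; the frame names are hypothetical and the command is echoed rather than executed, so ImageMagick isn't needed to try it:

```shell
#!/usr/bin/env bash
# Show the shape of one batch's convert command (first 100 sorted frames
# in, first frame's name reused for the output file in ../out2).
allframes=(frame001.png frame002.png frame003.png)
calcmethod1=mean
echo convert "${allframes[@]:0:100}" -evaluate-sequence "$calcmethod1" \
    -channel RGB -normalize "../out2/${allframes[0]}"
# prints: convert frame001.png frame002.png frame003.png -evaluate-sequence mean -channel RGB -normalize ../out2/frame001.png
```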
Building off of eschwartz's example, I put together a parallel example, again written more simply using sort -zV.
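A sketch of the sort of parallel invocation this describes, assuming GNU parallel and the question's out1/../out2 layout. With -N100 each job receives 100 file names; {} expands to all of them and {1} to the first (which names the output frame). The demo directory and frame names are hypothetical:

```shell
#!/usr/bin/env bash
# Hand GNU parallel the version-sorted frame list, 100 frames per job.
# --dryrun prints each convert command instead of running it.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch frame{1..250}.png        # hypothetical stand-ins for the sequence
calcmethod1=mean

printf '%s\0' *.png | sort -zV |
    parallel -0 -N100 --dryrun \
        convert '{}' -evaluate-sequence "$calcmethod1" \
        -channel RGB -normalize ../out2/'{1}'
# With 250 demo frames this prints three convert commands
# (batches of 100, 100 and 50).
```

Once the printed commands look right, drop --dryrun to actually run them; parallel's -j switch can cap the number of simultaneous jobs if memory is tight, since each job still loads 100 frames.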
NOTE: the above has an echo "..." as the parallel action to start; doing it this way helps to visualize what's happening. If you're satisfied with this output, simply remove the --dryrun switch to parallel, and rerun it.