I have a for loop in which a function task is called. Each call to the function returns a string that is appended to an array. I would like to parallelize this for loop. I tried using & but it does not seem to work.
Here is the non-parallelised code:
task (){ sleep 1;echo "hello $1"; }
arr=()
for i in {1..3}; do
arr+=("$(task $i)")
done
for i in "${arr[@]}"; do
echo "$i x";
done
The output is:
hello 1 x
hello 2 x
hello 3 x
Great! But now, when I try to parallelise it with
[...]
for i in {1..3}; do
arr+=("$(task $i)")&
done
wait
[...]
the output is empty.
UPDATE #1
Regarding the task function:
- The function task takes some time to run and then outputs one string. After all the strings have been gathered, another for loop will loop through the strings and perform some other task.
- The order does not matter. The output string can consist of a single line string, possibly with multiple words separated by a white space.
Best Answer
You can't send an assignment to the background, since the background process is a fork of the shell, and the changes to the variable aren't visible back in the main shell.
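A small demonstration of that limitation may help. This is only a sketch; the element values are made up for illustration. The append with & runs in a forked copy of the shell, so the array in the main shell never sees it:

```bash
#!/usr/bin/env bash
arr=(first)
arr+=(second) &    # the append happens in a background subshell (a fork)
wait               # wait for that subshell to finish
echo "${#arr[@]}"  # prints 1: the main shell's array is unchanged
```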
But you could run a bunch of tasks in parallel, have them all output to a pipe, and then read whatever comes out. Or actually, use process substitution, to avoid the issue of commands in a pipe being executed in a subshell (see "Why is my variable local in one 'while read' loop, but not in another seemingly similar loop?").
As long as the outputs are single lines written atomically, they won't get intermixed, but might get reordered:
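A minimal sketch of this approach, reusing the task function from the question. The exact layout (the inner for loop, the placement of wait, the variable names line and s) is an assumption, one way of doing it rather than the only one:

```bash
#!/usr/bin/env bash
task() { sleep 1; echo "hello $1"; }

arr=()
# The while loop runs in the main shell, so appends to arr persist.
while IFS= read -r line; do
    arr+=("$line")
done < <(
    # Start every task in the background, then wait for all of them.
    for i in {1..3}; do
        task "$i" &
    done
    wait
)

# Second pass over the gathered strings; their order may vary between runs.
for s in "${arr[@]}"; do
    echo "$s x"
done
```

The whole run should take about one second instead of three, since the three sleep calls overlap.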
The above will run all the tasks at the same time. There's also GNU parallel (and -P in GNU xargs), which is meant exactly for running tasks in parallel, and will only run a few at the same time. Parallel also buffers the outputs from the tasks, so you don't get intermixed data, even if the task writes lines in parts.
(Bash's mapfile here reads the input lines into the array, similarly to the while read .. arr+=() loop above.)
Running an external script as above is straightforward, but you can actually have it run an exported function too, though of course all tasks run in independent copies of the shell, so they'll have their own copies of each variable etc.
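Running an exported function could look roughly like the sketch below. It assumes GNU parallel is installed (the check at the top is an addition for safety, not something the technique requires) and that the calling shell is bash, since export -f is a bash feature:

```bash
#!/usr/bin/env bash
task() { sleep 1; echo "hello $1"; }
export -f task   # make the function visible to the shells parallel starts

arr=()
# Guard (an assumption of this sketch): only proceed if GNU parallel is available.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # One job per argument after ":::"; mapfile stores each output line in arr.
    mapfile -t arr < <(parallel task ::: a b c)
else
    echo "GNU parallel not found; skipping" >&2
fi

for s in "${arr[@]}"; do
    echo "$s x"
done
```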
The above example happened to keep a, b, and c in order, but that's a coincidence. Use parallel -k to have it make sure the outputs are kept in order.
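To see what -k changes, here is a hedged sketch in which later tasks deliberately finish first. The varying sleep times are an assumption added for illustration and are not part of the original task; the check for GNU parallel is likewise only a safety guard:

```bash
#!/usr/bin/env bash
# Task 1 sleeps longest and task 3 shortest, so they finish in reverse order.
task() { sleep "0.$(( 4 - $1 ))"; echo "hello $1"; }
export -f task

arr=()
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -k (--keep-order) emits the outputs in the order of the arguments.
    mapfile -t arr < <(parallel -k task ::: 1 2 3)
else
    echo "GNU parallel not found; skipping" >&2
fi

for s in "${arr[@]}"; do
    echo "$s x"
done
```

With -k the array should hold hello 1, hello 2, hello 3 in that order, even though task 3 completes first. Without it, the outputs would likely arrive in completion order.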