GNU Parallel, without any command line options, allows you to easily parallelize a command whose last argument is determined by a line of STDIN:
$ seq 3 | parallel echo
2
1
3
Note that parallel does not wait for EOF on STDIN before it begins executing jobs: running yes | parallel echo will begin printing infinitely many copies of y right away.
This behavior appears to change, however, if STDIN is relatively short:
$ { yes | ghead -n5; sleep 10; } | parallel echo
In this case, no output is produced until sleep 10 completes.
This is just an illustration — in reality I'm attempting to read from a series of continually generated FIFO pipes where the FIFO-generating process will not continue until the existing pipes start to be consumed. For example, my command will produce a STDOUT stream like:
/var/folders/2b/1g_lwstd5770s29xrzt0bw1m0000gn/T/tmp.PFcggGR55i
/var/folders/2b/1g_lwstd5770s29xrzt0bw1m0000gn/T/tmp.UCpTBzI3J6
/var/folders/2b/1g_lwstd5770s29xrzt0bw1m0000gn/T/tmp.r2EmSLW0t9
/var/folders/2b/1g_lwstd5770s29xrzt0bw1m0000gn/T/tmp.5TRNeeZLmt
Manually cat-ing each of these files one at a time in a new terminal causes the FIFO-generating process to complete successfully. However, running printfifos | parallel cat does not work. Instead, parallel seems to block forever waiting for input on STDIN. If I modify the pipeline to printfifos | head -n4 | parallel cat, the deadlock disappears and the first four pipes are printed successfully.
This behavior seems to be connected to the --jobs|-j parameter. Whereas { yes | ghead -n5; sleep 10; } | parallel echo produces no output for 10 seconds, adding a -j1 option yields four lines of y almost immediately, followed by a 10 second wait for the final y. Unfortunately this does not solve my problem: I need every argument to be processed before parallel can get EOF from reading STDIN. Is there any way to achieve this?
Best Answer
A bug in GNU Parallel causes it to start processing only after it has read one job for each jobslot. After that it reads one job at a time.
In older versions the output will also be delayed by the number of jobslots. Newer versions only delay output by a single job.
So if you sent one job per second to parallel -j10, it would read 10 jobs before starting them. With older versions you would then have to wait an additional 10 seconds before seeing the output from job 3.
A workaround for the limitation at start is to feed one dummy job per jobslot to parallel.
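A minimal sketch of the dummy-job idea, not taken from the answer itself: prime_slots is a hypothetical helper that prints one no-op command (true) per jobslot and then rewrites each incoming path as a cat command line. It relies on parallel running each input line as a full command when no command is given on its own command line, and it assumes the paths contain no whitespace or shell metacharacters. Whether this fully removes the startup stall may depend on the GNU Parallel version.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): print one no-op job per
# jobslot, then turn each incoming path into a "cat <path>" command line.
# Assumes the paths contain no whitespace or shell metacharacters.
prime_slots() {
    slots=$1
    i=0
    while [ "$i" -lt "$slots" ]; do
        echo true            # dummy job: occupies one slot and exits at once
        i=$((i + 1))
    done
    # Forward each real job as soon as its line arrives on stdin.
    while IFS= read -r path; do
        printf 'cat %s\n' "$path"
    done
}

# Intended use (printfifos is the question's generator; the argument to
# prime_slots should match the -j value):
#   printfifos | prime_slots 4 | parallel -j4
```

The dummy jobs satisfy the initial one-job-per-jobslot read, so the first real path should be picked up as soon as it is written.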
A workaround for the output delay is to use --linebuffer (but this will mix full lines from different jobs).
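As a small illustration of the --linebuffer workaround, the sketch below runs three jobs that each print a line, sleep for a second, and print another line. demo_linebuffer is a hypothetical name, and the function does nothing if GNU Parallel is not installed. With --linebuffer the start lines should show up before the jobs finish, interleaved across jobs, rather than being held back until each job completes.

```shell
#!/bin/sh
# Hypothetical demo of --linebuffer; skipped when GNU Parallel is not available.
demo_linebuffer() {
    if ! parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
        echo 'GNU Parallel not found; skipping demo' >&2
        return 0
    fi
    # Each job prints two lines one second apart; {} is replaced by the input line.
    seq 3 | parallel --linebuffer -j3 'echo start {}; sleep 1; echo end {}'
}
```

Calling demo_linebuffer in a terminal makes the interleaving visible; dropping --linebuffer from the command shows the default grouped output for comparison.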