Bash Scripting – Handling Non-Deterministic Output from Concurrent Processes

bash, process-substitution, tee

On bash v4.1.2(2), the following simple statement, chosen merely as a minimal example demonstrating the problem, gives seemingly random output:

$ for n in {0..1000}; do echo "$n"; done | 
  tee >(head -n2) >(sort -grk1,1 | head -n3) >/dev/null

whereas the following gives consistent output:

$ seq 0 10000 | tee >(head -n2) >(sort -grk1,1 | head -n3) >/dev/null

Specifically, for the first statement, the sort command chooses apparently random consecutive triplets (e.g., 226,225,224; 52,51,50; 174,173,172; etc.). To get a sense of the heterogeneity of the output, one can run the problematic command many times, and then list the number of distinct possibilities:

$ seq -w 0 2000 | while read x; do for n in {0..1000}; do echo "$n"; done | 
  tee >(head -n2) >(sort -grk1,1 | head -n3) >/dev/null | cat > "file_${x}"; done

Counting the occurrences of the various outputs:

$ for f in file_*; do sort -g "$f" | tail -n3 | paste -sd, ; done  | 
  sort | uniq -c | sort -gk1,1 -k2,2
   1 7,8,9
   1 17,18,19
   1 40,41,42
   1 43,44,45
   1 47,48,49
   1 50,51,52
   1 54,55,56
   1 58,59,60
   1 59,60,61
   1 66,67,68
   1 71,72,73
   1 78,79,80
   1 103,104,105
   1 104,105,106
   1 106,107,108
   1 110,111,112
   1 111,112,113
   1 121,122,123
   1 125,126,127
   1 129,130,131
   1 134,135,136
   1 136,137,138
   1 142,143,144
   1 143,144,145
   1 148,149,150
   1 150,151,152
   1 156,157,158
   1 157,158,159
   1 165,166,167
   1 171,172,173
   1 173,174,175
   1 174,175,176
   1 177,178,179
   1 179,180,181
   1 181,182,183
   1 183,184,185
   1 185,186,187
   1 186,187,188
   1 191,192,193
   1 194,195,196
   1 198,199,200
   1 200,201,202
   1 206,207,208
   1 208,209,210
   1 209,210,211
   1 210,211,212
   1 216,217,218
   1 217,218,219
   1 233,234,235
   1 236,237,238
   1 237,238,239
   1 238,239,240
   1 242,243,244
   1 245,246,247
   1 246,247,248
   1 254,255,256
   1 256,257,258
   1 267,268,269
   1 270,271,272
   1 273,274,275
   1 277,278,279
   1 279,280,281
   1 287,288,289
   1 288,289,290
   1 305,306,307
   1 306,307,308
   1 307,308,309
   1 326,327,328
   1 337,338,339
   1 339,340,341
   1 340,341,342
   1 351,352,353
   1 357,358,359
   1 359,360,361
   1 365,366,367
   1 368,369,370
   1 370,371,372
   1 376,377,378
   1 377,378,379
   1 383,384,385
   1 386,387,388
   1 388,389,390
   1 401,402,403
   1 408,409,410
   1 409,410,411
   1 415,416,417
   1 419,420,421
   1 424,425,426
   1 426,427,428
   1 432,433,434
   1 454,455,456
   1 462,463,464
   1 466,467,468
   1 475,476,477
   1 482,483,484
   1 487,488,489
   1 504,505,506
   1 508,509,510
   1 511,512,513
   1 532,533,534
   1 538,539,540
   1 544,545,546
   1 548,549,550
   1 558,559,560
   1 603,604,605
   1 604,605,606
   1 608,609,610
   1 659,660,661
   1 660,661,662
   1 663,664,665
   1 668,669,670
   1 692,693,694
   1 699,700,701
   1 717,718,719
   1 738,739,740
   1 740,741,742
   1 750,751,752
   1 771,772,773
   1 784,785,786
   1 796,797,798
   1 799,800,801
   1 806,807,808
   1 814,815,816
   1 832,833,834
   1 848,849,850
   1 858,859,860
   1 869,870,871
   1 922,923,924
   1 952,953,954
   1 961,962,963
   1 985,986,987
   2 64,65,66
   2 127,128,129
   2 141,142,143
   2 169,170,171
   2 170,171,172
   2 172,173,174
   2 187,188,189
   2 221,222,223
   2 234,235,236
   2 252,253,254
   2 292,293,294
   2 350,351,352
   2 364,365,366
   2 375,376,377
   2 622,623,624
   2 666,667,668
   3 70,71,72
   3 102,103,104
   3 137,138,139
   3 155,156,157
1826 998,999,1000

shows that the result is correct ~91% of the time. Omitting the >(head -n2) process substitution from the tee statement results in the output being correct 100% of the time.

I don't see why a race condition would be relevant in explaining the problem, since that should only affect the relative ordering of the output of each of the process substitutions in the tee statement (i.e., >(head -n2) may complete first or >(sort -grk1,1 | head -n3) may do so, but this should only affect the output order, not the result itself; it would even be understandable if the output of the two commands were randomly interleaved). Since tee should distribute identical copies of the stdout of the loop to the stdin of each >(), and since both process substitutions run in separate subshells (https://unix.stackexchange.com/a/331199/14960), neither one should affect the other, yet they clearly interact.

How can the interaction be explained? Also, how can the output of a for/while loop in bash be distributed to multiple, independent processes by tee?

Best Answer

head -n2 will quit after printing the first two lines. tee will then die (of a SIGPIPE) the next time it writes to the pipe to head. sort then sees EOF, because the tee at the other end of its own pipe is also gone, and sorts only the lines it has received so far.
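One way to observe this is to look at tee's exit status. The following is only an illustrative sketch, not part of the original answer: the function name tee_exit_status is made up, and the input is deliberately much larger than a typical pipe buffer so that tee cannot finish writing before head has gone away.

```bash
#!/usr/bin/env bash
# Illustrative sketch; tee_exit_status is a made-up name.
tee_exit_status() {
  # head exits after one line; a later write by tee to that pipe then fails.
  seq 1 100000 | tee >(head -n1 >/dev/null) >/dev/null
  # PIPESTATUS[1] is the exit status of tee in the pipeline above.
  echo "${PIPESTATUS[1]}"
}

tee_exit_status
```

With GNU tee and default signal dispositions, the printed status is expected to be 141, i.e. 128 plus 13, the number of SIGPIPE on Linux. If SIGPIPE is ignored in the calling environment, tee should report a write error and a different non-zero status instead.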

The reason why you're seeing it with the loop and not with seq is that the loop does several write()s on the pipe to tee, and depending on timing, that will likely cause tee to do several short read()s. seq, by contrast, writes its whole output in one go, so tee does just one read(). If you do a seq 1000000, you'll probably see random behaviour as well.
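To make the timing dependence visible, one can count how many lines reach the second reader before tee dies. This is a sketch under assumptions: wc -l stands in for the sort pipeline of the question, and the function name is invented for illustration.

```bash
#!/usr/bin/env bash
# Illustrative sketch; lines_reaching_second_reader is a made-up name.
lines_reaching_second_reader() {
  # wc -l stands in for sort: it reports how many lines tee managed to
  # forward before the failed write to head's pipe stopped it.
  for n in {0..1000}; do echo "$n"; done |
    tee >(head -n2 >/dev/null) >(wc -l) >/dev/null
}

# The trailing cat only returns once wc has written its count.
lines_reaching_second_reader | cat
```

Running this repeatedly should print varying counts of at most 1001, matching the random triplets seen in the question.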

To work around the problem, you'd need a version of head that keeps reading after it has output the first 2 lines. For instance, you could use sed '3,$d', which reads its input to the end, instead of head -n2 or sed 2q, both of which exit early.
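Applied to the loop from the question, the sed variant might look like the sketch below. The function name split_stream is illustrative only, and the final sort -g is added here just to make the combined output order reproducible.

```bash
#!/usr/bin/env bash
# Illustrative sketch; split_stream is a made-up name.
split_stream() {
  # sed '3,$d' prints lines 1-2 and discards the rest while still reading
  # to end-of-file, so its pipe stays open and tee is not killed.
  tee >(sed '3,$d') >(sort -grk1,1 | head -n3) >/dev/null
}

# The trailing sort -g waits for both process substitutions to finish
# and puts their combined output in a fixed order.
for n in {0..1000}; do echo "$n"; done | split_stream | sort -g
```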

Or use:

... | (
 trap '' PIPE
 exec tee >(trap - PIPE; exec head -n2) >(trap - PIPE; sort -rn | head -n2)
) > /dev/null

for tee (only) to ignore the SIGPIPE, but with some tee implementations, you'd see some error messages because of the failing write() to the pipe.

tee: /proc/self/fd/13: I/O error

Note that while the sorted output is likely to come after the non-sorted one, there's no guarantee. More generally, you can't really guarantee the order of output of programs that run concurrently unless there's something that coordinates them.
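One simple form of coordination, which is not the approach used in this answer, is to give up concurrency altogether: buffer the stream in a temporary file and run the consumers one after the other. A minimal sketch, with ordered_report as a made-up name:

```bash
#!/usr/bin/env bash
# Illustrative sketch; ordered_report is a made-up name.
ordered_report() {
  local tmp
  tmp=$(mktemp) || return 1
  cat >"$tmp"                      # buffer the whole stream once
  head -n2 "$tmp"                  # first consumer runs to completion...
  sort -grk1,1 "$tmp" | head -n3   # ...before the second one starts
  rm -f "$tmp"
}

for n in {0..1000}; do echo "$n"; done | ordered_report
```

The trade-off is that the whole input has to be stored before any consumer runs, which may not be acceptable for large or unbounded streams.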

Related Solutions

Shell – Output order with process substitution

In both

<file.txt  tee >(grep LITERAL) >(wc -l) >/dev/null

And:

{ { <file.txt tee /dev/fd/3 | grep LITERAL >&4; } 3>&1 | wc -l ;} 4>&1

All of tee, grep and wc are started concurrently. What matters then is what happens at the end.

wc will only print the result when it sees end-of-file on its standard input. In the first case, that's when tee exits, because tee then closes its fd on the other end of the pipe that wc is reading from (the one set up by process substitution). There's no guarantee that grep will have read all its input by that time, let alone written its output (given that pipes can hold quite a large amount of data and that wc will likely be faster than grep).

In the second case, wc will see end-of-file when all the writers to the pipe it is reading from have closed their end of the pipe. In that case, though, there are several writers: tee (via the fd it opened on /dev/fd/3 and via its fd 3) and grep, which also has its fd 3 open on the pipe to wc (though it makes no use of it, let alone writes to it). The inner { will likely cause an extra subshell process that also has fd 3 open and waits for both tee and grep.

That means that wc will only write its line count after grep has exited.
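The second form can be tried on a small sample file. In the sketch below, the function name demo_order and the sample lines are invented for illustration, and a system that provides /dev/fd is assumed.

```bash
#!/usr/bin/env bash
# Illustrative sketch; demo_order and the sample data are made up.
demo_order() {
  local f
  f=$(mktemp) || return 1
  printf '%s\n' 'alpha' 'LITERAL_beta' 'gamma' >"$f"
  # Same construct as above: fd 3 leads to wc, fd 4 to the original stdout.
  { { <"$f" tee /dev/fd/3 | grep LITERAL >&4; } 3>&1 | wc -l; } 4>&1
  rm -f "$f"
}

demo_order
```

In bash, the matching line should appear before the line count, because wc cannot see end-of-file until grep and the extra subshell have closed their copies of fd 3.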

Had you written it the proper way, that is, by closing the fds that didn't need to be open:

{ { <file.txt tee /dev/fd/3 4>&- | 
   grep LITERAL >&4 3>&- 4>&-; } 3>&1 | wc -l 4>&-;} 4>&1

Then the order would not have been guaranteed in shells that optimise out the subshell process. However, the only shell I know of that does so is ksh93, and ksh93 uses socket pairs for pipes, so /dev/fd/3 won't work there, on Linux at least.

To see what processes are running, you can replace grep with ps:

$ { { <file.txt tee /dev/fd/3 4>&- | ps -H >&4 3>&- 4>&-; } 3>&1 | wc -l 4>&-;} 4>&1
  PID TTY          TIME CMD
 8727 pts/5    00:00:00 bash
 8815 pts/5    00:00:00   bash
 8817 pts/5    00:00:00     tee
 8818 pts/5    00:00:00     ps
 8816 pts/5    00:00:00   wc

With bash, you can see that extra shell process, and you can see it also has the pipe opened on fd 3 with:

$ (p=$BASHPID; { { <file.txt tee /dev/fd/3 4>&- | lsof -ag "$p" -d3 >&4 3>&- 4>&-; } 3>&1 | wc -l 4>&-;} 4>&1)
COMMAND  PID PGID     USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
bash    9843 9842 chazelas    3w  FIFO    0,8      0t0 153304 pipe
tee     9845 9842 chazelas    3w  FIFO    0,8      0t0 153304 pipe
lsof    9846 9842 chazelas    3r   DIR    0,3        0      1 /proc

Bash – Process substitution from curl to bash as root

Don't use process substitution like that. In practice, it's pretty much just this anyway:

sudo sh <<CURL_SCRIPT
    $(curl -s http://copy.com/gLVZIqUubzcS/popcorn)
CURL_SCRIPT

Or:

curl -s http://copy.com/gLVZIqUubzcS/popcorn | sudo sh

Unless the script you're trying to run makes use of bashisms, the above will work. If it does use bash-only syntax, you should do:

curl -s http://copy.com/gLVZIqUubzcS/popcorn | sudo . /dev/stdin

Though the above doesn't seem to work, which I expect is due to sudo not liking the shell's built-in . (dot) command.

So do this:

curl -s http://copy.com/gLVZIqUubzcS/popcorn | sudo ${0#-} /dev/stdin

You could also simply do:

sudo sh -c "$(curl -s http://copy.com/gLVZIqUubzcS/popcorn)"

You don't need to invoke the bash executable again when you can use the shell's built-ins instead.
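The three invocation styles above can be compared without network access or sudo. In this sketch, fetch_script is a hypothetical local stand-in for the curl call, and the run_via_* function names are likewise invented.

```bash
#!/usr/bin/env bash
# fetch_script is a hypothetical stand-in for "curl -s URL".
fetch_script() {
  printf '%s\n' 'echo "hello from the fetched script"'
}

# Script text arrives on the shell's standard input.
run_via_pipe() { fetch_script | sh; }

# Script text is passed as the argument to -c.
run_via_arg() { sh -c "$(fetch_script)"; }

# Script text is expanded into a here-document.
run_via_heredoc() {
  sh <<FETCHED
$(fetch_script)
FETCHED
}

run_via_pipe
run_via_arg
run_via_heredoc
```

All three should print the same line. With the pipe and here-document forms the script occupies the shell's standard input, whereas with -c standard input is left untouched.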
