Bash – Explanation of Unpredictable Behavior of Tee Command

bash io-redirection tee

I encountered a behaviour that I don't understand while testing a script that sums the outputs of repeated executions of a program. To reproduce it, create two text files: out, which represents the output of my program, and sum, which holds the running total of the values returned by previous executions and starts out as a copy of out:

cat > out << EOF
2 20
5 50
EOF
cp out sum

The strange thing happens on running

paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}' | tee sum

several times (15-20 times might be needed). Each time it runs, this command should add to the values in sum the corresponding values in out and write the results back to sum. What I get is that it works an unpredictable number of times, then sum reverts back to

2 20
5 50

I later learned that I cannot redirect or tee output to the same file I'm reading from, and I solved the issue using a temporary file. Still, this behaviour baffles me:

  • why does … | tee sum work at all (even if only for a limited number of iterations), while … > sum never overwrites sum?

  • why doesn't it work a predictable number of times?

Best Answer

This,

paste out sum | awk ... | tee sum

has a race condition. paste opens sum to read it, and tee opens it for writing, truncating it. The shell starts both at approximately the same time, so it's up to chance which one gets to open the file first.

Of course in practice, the shell has to start the utilities one at a time, in some particular order. It probably does that from left to right, so paste might have a better chance of going first, but that's an implementation detail, and in any case the OS scheduler decides what runs when.

If paste gets to go first, it opens the file with the data still intact, and probably has enough time to read the data too. If tee gets to open the file before paste has read it, then paste sees an empty file instead.
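The "tee wins" case can be reproduced deterministically, which also shows why sum falls back to the original values rather than ending up empty. This is only a sketch: it runs in a scratch directory, the starting totals in sum are made up, and tee is fed from /dev/null so that it does nothing except open (and truncate) the file.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf '2 20\n5 50\n' > out
printf '8 80\n20 200\n' > sum   # made-up totals, as if a few runs had succeeded

# tee truncates sum as soon as it opens it, even though it receives no input.
tee sum < /dev/null

# This is what paste and awk see when they lose the race: sum is empty,
# so fields 3 and 4 are missing and awk adds 0 to each value from out.
result=$(paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}')
echo "$result"
```

The printed lines are just the contents of out, which is exactly what the real pipeline writes back to sum whenever tee opens the file before paste has read it.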

Here,

paste out sum | awk ... > sum

the shell opens sum for writing, truncating it. It might do that in parallel with starting paste, but since truncating sum doesn't involve starting another utility, it probably happens first. (I'm not sure whether there's a rule about the order of processing redirections and starting the commands in a pipeline like this, so I wouldn't count on it.)
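For a single command (no pipeline) the order is well defined: the shell sets up the redirection before it runs the command. A small sketch of that simpler case, using a made-up file called data in a scratch directory:

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf '3\n1\n2\n' > data

# The shell opens (and truncates) data for the redirection first and only
# then runs sort, so sort reads an already empty file.
sort data > data

if [ -s data ]; then
    status="data still has content"
else
    status="data is empty"
fi
echo "$status"
```

In the pipeline from the question the truncation happens in a separate process from paste, so there it is still a matter of timing rather than a guarantee.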

There's a tool called sponge (from the moreutils package) to fix this issue (and a dozen questions about it). It collects all the input it gets and only writes it out after the input is closed. This should have sum updated correctly, always:

paste out sum | awk ... | sponge sum
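If sponge isn't installed, the temporary-file approach mentioned in the question does the same job with standard tools. A minimal sketch, assuming mktemp is available; update_sum is a hypothetical helper name, and the scratch directory is only there to keep the demo self-contained:

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf '2 20\n5 50\n' > out
cp out sum

# Hypothetical helper: add the values in out to the running totals in sum.
update_sum() {
    tmp=$(mktemp ./sum.XXXXXX) || return 1
    # The pipeline only reads sum; the new totals go to the temporary file.
    paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}' > "$tmp" &&
        mv "$tmp" sum    # replace sum only after the pipeline has finished
}

update_sum
update_sum
cat sum
```

Because sum is never opened for writing while paste is reading it, there is nothing left to race on. Starting from a copy of out, two updates leave sum holding three times the values in out.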