I encountered a behaviour that I don't understand while testing a script that sums the outputs from repeated executions of a program. To reproduce it, create the text files out, which represents the output of my program, and sum, the file that holds the sum of the values returned on previous executions and which starts out as a copy of out:
cat > out << EOF
2 20
5 50
EOF
cp out sum
The strange thing happens on running
paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}' | tee sum
several times (15-20 times might be needed). Each time it runs, this command should add to the values in sum the corresponding values in out and write the results back to sum. What I get is that it works an unpredictable number of times, then sum reverts back to
2 20
5 50
I have since learned that I cannot redirect or tee output to the same file I'm working on, and I solved the issue using a temporary file. Still, this behaviour baffles me:
- why does … | tee sum work at all (even if only for a limited number of iterations), while … > sum never overwrites sum?
- why doesn't it work a predictable number of times?
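A minimal sketch of the temporary-file workaround mentioned above, for comparison. The scratch directory and the name sum.tmp are assumptions made for illustration, not necessarily what the real script uses. The new totals go to a separate file, and sum is replaced only after awk has finished successfully:

```shell
# Set up the example files from the question in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '2 20\n5 50\n' > out
cp out sum

# Write the new totals to a separate file, so sum is never opened for
# writing while paste is still reading it; replace sum only on success.
for i in 1 2 3; do
    paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}' > sum.tmp &&
        mv sum.tmp sum
done

cat sum
```

After three runs sum should hold 8 80 and 20 200, that is, the initial copy of out plus three more additions of out.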
Best Answer
This,
paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}' | tee sum
has a race condition. paste opens sum to read it, and tee opens it for writing, truncating it. The shell starts both at approximately the same time, so it's up to chance which one gets to open the file first.
Of course in practice, the shell has to start the utilities one at a time, in some particular order. It probably does that from left to right, so paste might have a better chance of going first, but that's an implementation detail, and in any case the OS scheduler decides what runs when.
If paste gets to go first, it opens the file with the data still intact, and probably has enough time to read the data too. If tee gets to open the file before paste has read it, then paste sees an empty file instead.
Here,
paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}' > sum
the shell opens sum for writing, truncating it. It might do that in parallel to starting paste, but since truncating sum doesn't involve starting another utility, it probably happens first. (I'm not exactly sure if there's a rule about the order of processing redirections and starting the commands in a pipeline like this, but I wouldn't count on it.)
There's a tool called sponge (from the moreutils package) to fix this issue (and a dozen questions about it). It collects the input it gets and only writes it after the input is closed. This should have sum updated correctly, always:
paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}' | sponge sum
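The two outcomes of the race described earlier can also be reproduced without relying on timing, by forcing each ordering by hand. This is only a sketch under a few assumptions: it runs in a scratch directory, it uses the out and sum files from the question, and the variable names reader_first and writer_first are purely illustrative.

```shell
# Work in a scratch directory with the files from the question.
cd "$(mktemp -d)" || exit 1
printf '2 20\n5 50\n' > out

# Ordering 1: paste reads sum before anything truncates it.
cp out sum
reader_first=$(paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}')

# Ordering 2: sum is truncated first (what tee or "> sum" does on open),
# so paste sees an empty file and awk adds nothing to the values in out.
cp out sum
: > sum
writer_first=$(paste out sum | awk '{$1 += $3; $2 += $4; NF = 2; print}')

printf 'reader first:\n%s\n' "$reader_first"
printf 'writer first:\n%s\n' "$writer_first"
```

The second result is exactly the contents of out, which matches the "revert" seen in the question: whenever the truncation wins the race, the accumulated totals are lost and sum starts over from a copy of out.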