The example below surprised me. It seems counterintuitive, aside from the fact that there is a whisker more user time for the echo | sed combo.
Why is echo using so much sys time when it runs alone? Or should the question be: how does sed change the state of play? It seems that echo would need to do the same echoing in both cases.
time echo -n a\ {1..1000000}\ c$'\n' >file
# real 0m9.481s
# user 0m5.304s
# sys 0m4.172s
time echo -n a\ {1..1000000}\ c$'\n' |sed s/^\ // >file
# real 0m5.955s
# user 0m5.488s
# sys 0m1.580s
Best Answer
bahamat and Alan Curry have it right: this is due to the way your shell buffers the output of echo. Specifically, your shell is bash, and it issues one write system call per line. Hence the first snippet makes 1000000 writes to a disk file, whereas the second snippet makes 1000000 writes to a pipe, and sed (largely in parallel, if you have multiple CPUs) makes a considerably smaller number of writes to a disk file due to its output buffering.
You can observe what's going on by running strace.
Other shells such as ksh buffer the output of echo even when it's multiline, so you won't see much of a difference.
With bash I get similar timing ratios. With ksh I see the second snippet running slower.