Iterating Over a File vs Reading into Memory – Performance

bash io performance

I'm comparing the following

tail -n 1000000 stdout.log | grep -c '"success": true'
tail -n 1000000 stdout.log | grep -c '"success": false'

with the following

log=$(tail -n 1000000 stdout.log)
echo "$log" | grep -c '"success": true'
echo "$log" | grep -c '"success": false'

and, surprisingly, the second takes almost three times longer than the first. It should be faster, shouldn't it?

Best Answer

On the one hand, the first method runs tail twice, so it has to do more work than the second method, which runs it only once. On the other hand, the second method has to copy the data into the shell and then back out, so it has to do more work than the first version, where tail is piped directly into grep. The first method has an extra advantage on a multi-processor machine: grep can work in parallel with tail, whereas the second method is strictly serialized: first tail, then grep.

So there's no obvious reason why one should be faster than the other.
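To put numbers on it, a rough harness along these lines (reusing the stdout.log from the question) times each approach as a whole; run it more than once so the file is in cache for both:

time sh -c '
  tail -n 1000000 stdout.log | grep -c "\"success\": true";
  tail -n 1000000 stdout.log | grep -c "\"success\": false"'

time bash -c '
  log=$(tail -n 1000000 stdout.log);
  echo "$log" | grep -c "\"success\": true";
  echo "$log" | grep -c "\"success\": false"'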

If you want to see what's going on, look at what system calls the shell makes. Try with different shells, too.

strace -t -f -o 1.strace sh -c '
  tail -n 1000000 stdout.log | grep "\"success\": true" | wc -l;
  tail -n 1000000 stdout.log | grep "\"success\": false" | wc -l'

strace -t -f -o 2-bash.strace bash -c '
  log=$(tail -n 1000000 stdout.log);
  echo "$log" | grep "\"success\": true" | wc -l;
  echo "$log" | grep "\"success\": true" | wc -l'

strace -t -f -o 2-zsh.strace zsh -c '
  log=$(tail -n 1000000 stdout.log);
  echo "$log" | grep "\"success\": true" | wc -l;
  echo "$log" | grep "\"success\": true" | wc -l'

With method 1, the main stages are:

  1. tail reads and seeks to find its starting point (see the check after this list).
  2. tail writes 4096-byte chunks which grep reads as fast as they're produced.
  3. Repeat the previous step for the second search string.
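To confirm the first step, the trace should show tail seeking near the end of the file rather than reading it from the beginning, with something like:

grep 'lseek(' 1.strace | head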

With method 2, the main stages are:

  1. tail reads and seeks to find its starting point.
  2. tail writes 4096-byte chunks, which bash reads 128 bytes at a time and zsh reads 4096 bytes at a time (a quick check follows the list).
  3. Bash or zsh writes 4096-byte chunks which grep reads as fast as they're produced.
  4. Repeat the previous step for the second search string.
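A quick check of the shell's read sizes in step 2 is to count how many read calls in each trace returned exactly 128 or 4096 bytes. It's only a rough count, since grep's and tail's own reads are mixed into the same file:

grep -Ec 'read\(.*\) = 128$' 2-bash.strace
grep -Ec 'read\(.*\) = 4096$' 2-zsh.strace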

Bash's 128-byte chunks when reading the output of the command substitution slow it down significantly; zsh comes out about as fast as method 1 for me. Your mileage may vary depending on the number and type of CPUs, the scheduler configuration, the versions of the tools involved, and the size of the data.
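If you want a feel for how much the small chunks cost by themselves, push the same amount of data through a pipe in 128-byte and then 64 KiB blocks (about 100 MiB of zeros here, nothing to do with the log file):

time dd if=/dev/zero bs=128 count=819200 2>/dev/null | wc -c
time dd if=/dev/zero bs=64k count=1600 2>/dev/null | wc -c

Both pipelines move the same data, but the first needs hundreds of times more system calls, and that difference shows up directly in the timings.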
