Bash – How to time a pipe

bashpipeshelltime

I want to time a command which consists of two separate commands with one piping output to another. For example, consider the two scripts below:

$ cat foo.sh
#!/bin/sh
sleep 4

$ cat bar.sh
#!/bin/sh
sleep 2

Now, how can I get time to report the time taken by foo.sh | bar.sh (and yes, I know the pipe makes no sense here, but this is just an example)? It does work as expected if I run them sequentially in a subshell without piping:

$ time ( foo.sh; bar.sh )

real    0m6.020s
user    0m0.010s
sys     0m0.003s

But I can't get it to work when piping:

$ time ( foo.sh | bar.sh )

real    0m4.009s
user    0m0.007s
sys     0m0.003s

$ time ( { foo.sh | bar.sh; } )

real    0m4.008s
user    0m0.007s
sys     0m0.000s

$ time sh -c "foo.sh | bar.sh "

real    0m4.006s
user    0m0.000s
sys     0m0.000s

I've read through a similar question (How to run time on multiple commands AND write the time output to file?) and also tried the standalone time executable:

$ /usr/bin/time -p sh -c "foo.sh | bar.sh"
real 4.01
user 0.00
sys 0.00

It doesn't even work if I create a third script which only runs the pipe:

$ cat baz.sh
#!/bin/sh
foo.sh | bar.sh

And then time that:

$ time baz.sh

real    0m4.009s
user    0m0.003s
sys     0m0.000s

Interestingly, it doesn't appear as though time exits as soon as the first command is done. If I change bar.sh to:

#!/bin/sh
sleep 2
seq 1 5

And then time again, I was expecting the time output to be printed before the seq but it isn't:

$ time ( { foo.sh | bar.sh; } )
1
2
3
4
5

real    0m4.005s
user    0m0.003s
sys     0m0.000s

Looks like time doesn't count the time it took to execute bar.sh despite waiting for it to finish before printing its report¹.

All tests were run on an Arch system and using bash 4.4.12(1)-release. I can only use bash for the project this is a part of so even if zsh or some other powerful shell can get around it, that won't be a viable solution for me.

So, how can I get the time a set of piped commands took to run? And, while we're at it, why doesn't it work? It looks like time immediately exits as soon as the first command has finished. Why?

I know I can get the individual times with something like this:

( time foo.sh ) 2>foo.time | ( time bar.sh ) 2> bar.time

But I still would like to know if it's possible to time the whole thing as a single operation.

¹ _{This doesn't seem to be a buffer issue, I tried running the scripts with unbuffered and stdbuf -i0 -o0 -e0 and the numbers were still printed before the time output.}

Best Answer

It is working.

The different parts of a pipeline are executed concurrently. The only thing that synchronises/serialises the processes in the pipeline is IO, i.e. one process writing to the next process in the pipeline and the next process reading what the first one writes. Apart from that, they are executing independently of each other.

Since there is no reading or writing happening between the processes in your pipeline, the time take to execute the pipeline is that of the longest sleep call.

You might as well have written

time ( foo.sh & bar.sh &; wait )

Terdon posted a couple of slightly modified example scripts in the chat:

#!/bin/sh
# This is "foo.sh"
echo 1; sleep 1
echo 2; sleep 1
echo 3; sleep 1
echo 4

and

#!/bin/sh
# This is "bar.sh"
sleep 2
while read line; do
  echo "LL $line"
done
sleep 1

The query was "why does time ( sh foo.sh | sh bar.sh ) return 4 seconds rather than 3+3 = 6 seconds?"

To see what's happening, including the approximate time each command is executed, one may do this (the output contains my annotations):

$ time ( env PS4='$SECONDS foo: ' sh -x foo.sh | PS4='$SECONDS bar: ' sh -x bar.sh )
0 bar: sleep 2
0 foo: echo 1     ; The output is buffered
0 foo: sleep 1
1 foo: echo 2     ; The output is buffered
1 foo: sleep 1
2 bar: read line  ; "bar" wakes up and reads the two first echoes
2 bar: echo LL 1
LL 1
2 bar: read line
2 bar: echo LL 2
LL 2
2 bar: read line  ; "bar" waits for more
2 foo: echo 3     ; "foo" wakes up from its second sleep
2 bar: echo LL 3
LL 3
2 bar: read line
2 foo: sleep 1
3 foo: echo 4     ; "foo" does the last echo and exits
3 bar: echo LL 4
LL 4
3 bar: read line  ; "bar" fails to read more
3 bar: sleep 1    ; ... and goes to sleep for one second

real    0m4.14s
user    0m0.00s
sys     0m0.10s

So, to conclude, the pipeline takes 4 seconds, not 6, due to the buffering of the output of the first two calls to echo in foo.sh.

Best Answer

Related Solutions

How to get the time of several commands with a loop

Shell – How to display the number of lines output by a command in real time

Related Question