Pipe – Does Tee Slow Down Pipelines?


I am wondering whether tee slows down pipelines. Writing data to disk is slower than piping it along, after all.

Does tee hold data back from the next stage of the pipeline until it has been written to disk? (If not, I guess tee has to queue data that has already been passed along but not yet written to disk, which sounds unlikely to me.)

$ program1 input.txt | tee intermediate-file.txt | program2 ...

Best Answer

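Yes, it slows things down. And it basically does have a queue of unwritten data, though that's actually maintained by the kernel: writes land in the kernel's cache first and are flushed to disk later. All programs get that, unless they explicitly request otherwise (for example with O_SYNC or fsync).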

For example, here is a trivial pipe using pv, which is nice because it displays transfer rate:

$ pv -s 50g -S -pteba /dev/zero | cat > /dev/null 
  50GiB 0:00:09 [ 5.4GiB/s] [===============================================>] 100%

Now, let's add a tee in there, not even writing an extra copy—just forwarding it along:

$ pv -s 50g -S -pteba /dev/zero | tee | cat > /dev/null 
  50GiB 0:00:20 [2.44GiB/s] [===============================================>] 100%            

So, that's quite a bit slower, and tee wasn't even writing a file! That's the overhead of tee copying standard input to standard output internally. (Interestingly, adding a second pv in there instead stays at 5.19GiB/s, so pv is substantially faster than tee. pv uses splice(2); tee likely does not.)
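
To make that overhead concrete, here is a rough sketch of the kind of copy loop tee performs. The function name simple_tee is hypothetical, and this is not how the real tee is implemented (the real tool is written in C and copies raw byte buffers, not lines); it only illustrates that each piece of input is read into the process and then written out twice.

```shell
# simple_tee FILE: copy stdin to stdout and to FILE (hypothetical helper).
# Line-oriented for brevity; the real tee copies raw byte buffers instead.
simple_tee() {
    out_file=$1
    : > "$out_file"                            # truncate, like tee without -a
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"                  # copy 1: forward down the pipe
        printf '%s\n' "$line" >> "$out_file"   # copy 2: the file
    done
}

tmp_copy=$(mktemp)
printf 'alpha\nbeta\n' | simple_tee "$tmp_copy" | tr 'a-z' 'A-Z'
```

The last stage of the pipeline sees the forwarded lines, while the temporary file named in tmp_copy keeps an unmodified copy of them.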

Anyway, let's see what happens if I tell tee to write to a file on disk. It starts out fairly fast (~800MiB/s) but as it goes on, it keeps slowing down—ultimately down to ~100MiB/s, which is basically 100% of the disk write bandwidth. (The fast start is due to the kernel caching the disk write, and the slowdown to disk write speed is the kernel refusing to let the cache grow infinitely.)
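
The command for that run isn't shown above; presumably it was the same pipeline with a file name handed to tee. Here is a scaled-down sketch that is safe to try. The temporary file and the 16 MiB size are my assumptions, and head -c stands in for pv so that nothing extra needs to be installed.

```shell
# Scaled-down version of the disk-write test (16 MiB, not 50 GiB).
# With pv installed, the full-size run would presumably look like:
#   pv -s 50g -S -pteba /dev/zero | tee intermediate-file.bin | cat > /dev/null
# (an assumption; note that it writes 50 GiB to disk).
bytes=$((16 * 1024 * 1024))
tee_file=$(mktemp)

# tee writes one copy to the file and forwards another down the pipe.
head -c "$bytes" /dev/zero | tee "$tee_file" | cat > /dev/null

# The file should now hold every byte that went through the pipeline.
wc -c < "$tee_file"
```

At this size the whole write fits comfortably in the kernel's cache, so it won't reproduce the slowdown; it only shows the shape of the pipeline. Remove the temporary file afterwards with rm -f "$tee_file".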

Does it matter?

The above is a worst case: it uses a pipe to spew data as fast as possible. The only real-world use like this that I can think of is piping raw YUV data to or from ffmpeg.

When you're sending data at slower rates (because the programs in the pipeline are actually processing it, etc.), the effect is going to be much less significant.
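
A quick way to convince yourself of this is to time a deliberately slow producer with and without tee. The sketch below is only an illustration: slow_producer is a hypothetical stand-in for a program that spends about a second computing each record, and the timing uses whole seconds from date, so it is coarse.

```shell
# Hypothetical slow producer: one record per second, standing in for a
# program that spends time computing each piece of output.
slow_producer() {
    for i in 1 2; do
        sleep 1
        echo "record $i"
    done
}

slow_copy=$(mktemp)

# Baseline: no tee in the pipeline.
start=$(date +%s)
slow_producer | cat > /dev/null
plain_secs=$(( $(date +%s) - start ))

# Same pipeline with tee writing a copy to disk.
start=$(date +%s)
slow_producer | tee "$slow_copy" | cat > /dev/null
tee_secs=$(( $(date +%s) - start ))

echo "without tee: ${plain_secs}s, with tee: ${tee_secs}s"
```

Both runs should take about as long as the producer itself, because the pipeline is limited by the slowest stage and tee is nowhere near being that stage here.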
