Reading a named pipe: tail or cat

cat fifo pipe tail

I made a named pipe (FIFO) using

mkfifo fifo

As soon as something is written to this pipe, I want to reuse it immediately. Should I use

tail -f fifo

or

while true; do cat fifo; done

?

They seem to do the same thing and I could not measure a difference in performance. However, when a system does not support inotify (Busybox, for example), the former needs to be

tail -f -s 0 fifo

But this eats up the CPU with 100% usage (to test it: mkfifo fifo && busybox tail -f -s 0 fifo & echo hi > fifo, then cancel with fg 1 and Ctrl+C). So is the while-true-cat loop the more reliable solution?

Best Answer

When you do:

cat fifo

Assuming no other process has opened the fifo for writing yet, cat will block on the open() system call. When another process opens the file for writing, a pipe will be instantiated and open() will return. cat will call read() in a loop and read() will block until some other process writes data to the pipe.

cat will see end-of-file (eof) when all the writing processes have closed their file descriptors to the fifo, at which point cat terminates and the pipe is destroyed¹.

You'd need to run cat again to read what is written to the fifo after that (but via a different pipe instance).
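A minimal sketch of that one-round behaviour, using a throwaway directory from mktemp (the file names fifo, out1 and out2 are only illustrative):

```shell
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# First round: cat blocks in open() until a writer shows up.
cat "$dir/fifo" > "$dir/out1" &
reader=$!
echo first > "$dir/fifo"    # open, write, close: cat then sees eof
wait "$reader"              # returns because cat has terminated
first_round=$(cat "$dir/out1")

# Second round: the previous cat is gone, so a new one is needed.
cat "$dir/fifo" > "$dir/out2" &
reader=$!
echo second > "$dir/fifo"
wait "$reader"
second_round=$(cat "$dir/out2")

rm -rf "$dir"
```

Each cat handles exactly one open/write/close cycle of the writer, which is why the question's loop restarts it every time.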

In:

tail -f file

Like cat, tail will wait for a process to open the fifo for writing. But here, since you didn't specify -n +1 to copy from the beginning, tail will need to wait until eof to find out what the last 10 lines were, so you won't see anything until the writing end is closed.

After that, tail will not close its fd to the pipe, which means the pipe instance won't be destroyed, and it will still attempt to read from the pipe every second (on Linux, that polling can be avoided via inotify, and some versions of GNU tail do that there). Until some other process opens the file again for writing, each of those read()s returns eof straight away. That is why you see 100% CPU with -s 0, which with GNU tail means not waiting between read()s instead of waiting for one second.

Here instead, you may want to use cat, but make sure the pipe instance always stays around after it has been instantiated. For that, on most systems, you could do:

cat 0<> fifo # the 0 is needed for recent versions of ksh93 where the
             # default fd changed from 0 to 1 for the <> operator

cat's stdin will be open for both reading and writing which means cat will never see eof on it (it also instantiates the pipe straight away even if there's no other process opening the fifo for writing).
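As a rough sketch of this, assuming a system (such as Linux) where opening a fifo read-write is supported and does not block, the following shows one cat surviving two separate writers. The file names and the one-second pause are illustrative choices, not part of the technique:

```shell
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# One long-lived reader; its stdin is open read+write on the fifo.
cat > "$dir/out" 0<> "$dir/fifo" &
reader=$!

echo one > "$dir/fifo"      # first writer opens, writes, closes
echo two > "$dir/fifo"      # second writer: same cat, same pipe instance

sleep 1                     # crude pause so cat can copy both lines
kill "$reader"              # cat never sees eof, so stop it explicitly
wait "$reader" 2>/dev/null || true

result=$(cat "$dir/out")
rm -rf "$dir"
```

Because cat never sees eof here, something else has to decide when to stop it; the kill above stands in for that.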

On systems where that doesn't work, you can do instead:

cat < fifo 3> fifo

That way, as soon as some other process opens the fifo for writing, the first read-only open() will return, at which point the shell will do the write-only open() before starting cat, which will prevent the pipe from ever being destroyed again.
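A small sketch of that variant, with illustrative file names and a crude one-second pause before stopping the reader:

```shell
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# The read-only open blocks until the first writer arrives; the shell
# then opens fd 3 write-only, which keeps the pipe instance alive.
cat < "$dir/fifo" 3> "$dir/fifo" > "$dir/out" &
reader=$!

echo one > "$dir/fifo"
echo two > "$dir/fifo"      # still read by the same cat

sleep 1                     # let cat drain the pipe
kill "$reader"
wait "$reader" 2>/dev/null || true

result=$(cat "$dir/out")
rm -rf "$dir"
```

Nothing is ever written to fd 3; it exists only so that the pipe always has at least one writer.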

So, to sum up, with cat 0<> fifo (or cat < fifo 3> fifo):

  • compared to cat file, it would not stop after the first round.
  • compared to tail -n +1 -f file: it would not do a useless read() every second after the first round, there would never be eof on the one instance of the pipe, and there would not be that delay of up to one second when a second process opens the pipe for writing after the first one has closed it.
  • compared to tail -f file: in addition to the above, it would not have to wait for the first round to finish before outputting something (and then only the last 10 lines).
  • compared to cat file in a loop, there would be only one pipe instance. The race window mentioned in ¹ would be avoided.

¹ At this point, in between the last read() that indicates eof and cat terminating and closing the reading end of the pipe, there is actually a small window during which a process could open the fifo for writing again (and not be blocked, as there's still a reading end). Then, if it writes something after cat has exited and before another process opens the fifo for reading, it would get killed with a SIGPIPE.
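The race itself is hard to hit on demand, but its outcome can be forced with sleeps. The sketch below is a contrived stand-in, not the race as it occurs in practice: a hypothetical reader opens the fifo and exits without reading, while the writer opens it in time but writes too late.

```shell
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# Reader: opens the fifo, lingers for a second, exits without reading.
sh -c 'exec < "$1"; sleep 1' sh "$dir/fifo" &

# Writer: its open succeeds while the reader is still there, but the
# write happens after the reading end has been closed.
status=0
sh -c 'exec > "$1"; sleep 2; echo late' sh "$dir/fifo" || status=$?

wait
rm -rf "$dir"
# status is non-zero: typically 141 (128 + SIGPIPE) on Linux, or a
# plain write error if SIGPIPE happens to be ignored in the environment.
```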
