Bash – Understanding inotifywait, pipes and buffers

bash, buffer, inotify, pipe

I want to monitor every file change in a directory with inotifywait. inotifywait shall write to a FIFO buffer, which then can be read, leisurely. While experimenting with relatively huge amounts of events I encountered some bottlenecks, which I'd like to understand.

The changes are always caused by touch {0000..9999}testfile. The bottleneck shows up as file events that are not caught.

When I redirect inotifywait's output to a file, everything gets logged as it should be.

inotifywait -q -m ./ writing to the Terminal catches CREATE, OPEN, ATTRIB, CLOSE for about 5000 to 8000 files. I guess the "write to screen" is not fast enough to be non-blocking?

If I pipe to cat (inotifywait... | cat | ... | cat), I eventually get them all. So I guess the pipes are doing some kind of buffering, but I don't really understand how this works, or even what to look up. Could someone please explain this?

I also played with a "solution" I found here, using pv -q -B 1g as a buffer (and also the buffer command).

inotifywait -q -m ./ | BUFFER | \
while IFS= read -r line; do
   # Do something with $line or ...
   sleep 1
done

How can I make sure every file event gets processed? I have a feeling my little bash voodoo experiment has run into some deeper constraints that I'd like more insight into.

Best Answer

If the output of inotifywait -q -m ./ is not redirected and you're running it in a terminal emulator, the output will go to a pty device. A pty device is a form of interprocess communication, a bit like a pipe though with added features to facilitate terminal-like interactions.

At the other end of that pty "pipe", your terminal emulator will read what inotifywait writes and render it on the screen. Doing that rendering is complicated and expensive in CPU time.

If your terminal emulator is slower to empty that pipe than inotifywait is to fill it up, then the pty pipe will get full. When it is full, as with pipes, the writing process blocks (the write() system call doesn't return) until there's free space again in the "pipe".

With my version of Linux, I find that I can write 19457 bytes to a pty device with nothing reading at the other end before it blocks if I write 1 byte at a time:

$ socat -u  'exec:dd bs=1 if=/dev/zero,pty' 'exec:sleep inf,nofork' &
[1] 1247815
$ pkill -USR1 -x dd
19458+0 records in
19457+0 records out
19457 bytes (19 kB, 19 KiB) copied, 14.7165 s, 1.3 kB/s

19458 bytes if I write 2 bytes at a time, 19712 if I write 256 bytes at a time, and different values if I put the terminal in raw mode or include newlines in the data I send (as they get transformed to CRLFs).

In any case, I don't think that buffer size is customizable.

inotifywait uses the inotify API to retrieve that list of events. In the inotify(7) man page, you'll find:

The following interfaces can be used to limit the amount of kernel memory consumed by inotify:

  • /proc/sys/fs/inotify/max_queued_events

    The value in this file is used when an application calls inotify_init(2) to set an upper limit on the number of events that can be queued to the corresponding inotify instance. Events in excess of this limit are dropped, but an IN_Q_OVERFLOW event is always generated.

When inotifywait is blocked on the write() to standard output, it can't process the events put on that queue by the kernel. If that queue itself gets full, events are discarded.

On my system,

$ sysctl fs.inotify.max_queued_events
fs.inotify.max_queued_events = 16384
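
As a rough sketch, the inotify limits can also be read straight from /proc. The helper name show_inotify_limits is made up for this example, and the value in the commented-out sysctl line is an arbitrary illustration, not a recommendation:

```shell
# Print the inotify limits of the running kernel, or "unavailable" if the
# /proc entries cannot be read (for example on a non-Linux system).
show_inotify_limits() {
  local dir=/proc/sys/fs/inotify name
  for name in max_queued_events max_user_instances max_user_watches; do
    if [ -r "$dir/$name" ]; then
      printf '%s = %s\n' "fs.inotify.$name" "$(cat "$dir/$name")"
    else
      printf '%s = unavailable\n' "fs.inotify.$name"
    fi
  done
}

show_inotify_limits

# Raising the queue limit needs root. Per the man page excerpt above, the
# value is applied when inotify_init(2) is called, so inotifywait has to be
# (re)started afterwards. The number below is an arbitrary example:
#   sudo sysctl -w fs.inotify.max_queued_events=1048576
```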

Now, when you do:

inotifywait -q -m ./ | cat

This time, we have a pipe in between inotifywait and cat and a pty between cat and your terminal emulator.

Pipes have a larger buffer than ptys: 64KiB by default on Linux, though it can be raised on a per-pipe basis, up to the fs.pipe-max-size sysctl value (1MiB by default), using fcntl(fd, F_SETPIPE_SZ, newsize).
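
In the same spirit as the pty measurement above, here is a minimal sketch for estimating a pipe's capacity from the shell. The function name pipe_capacity_kib is invented for this example, and the approach (counting 1 KiB blocks until the writer blocks) is only an approximation. On a Linux system with default settings I would expect it to report about 64, but that depends on the kernel and its configuration:

```shell
# Estimate how many KiB fit into a pipe that nobody reads from.
pipe_capacity_kib() {
  local counter n=0
  counter=$(mktemp) || return 1
  # Writer: push 1 KiB blocks into the pipe and record each completed block.
  # Reader: "sleep 2" never reads, so the writer blocks once the pipe is
  # full. When sleep exits, the blocked head fails (SIGPIPE) and the loop ends.
  (
    i=0
    while head -c 1024 /dev/zero; do
      i=$((i + 1))
      echo "$i" > "$counter"
    done 2>/dev/null
  ) | sleep 2
  [ -s "$counter" ] && n=$(cat "$counter")
  rm -f "$counter"
  echo "$n"
}

pipe_capacity_kib
```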

So before inotifywait's write() blocks, we need to fill up both those buffers. Plus, cat will also have read some data into its own read buffer and be waiting to write it out.

For each | cat you add, you add extra buffering space (at least 64KiB more).

With pv -q -B 1g, pv will buffer data internally.

cat and pv will be quicker at reading their input than your terminal emulator, because they need to do far less work to process it, but if inotifywait itself is not quick enough to read, decode and format events, some can still be dropped.

To minimize the chance of events being dropped, you can:

  • increase fs.inotify.max_queued_events
  • avoid sending inotifywait output to slow consumers, or add sufficient buffering if you do
  • tune inotifywait filters to only select the events you're interested in
  • make sure inotifywait and the consumers of its output are not given a low priority (don't run them under nice)
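
One way to combine the buffering and filtering points is to let a regular file act as the buffer, since the question already notes that redirecting to a file loses nothing. The following is only a sketch: process_events and events.log are hypothetical names, and the two events selected with -e are just an example of narrowing the filter:

```shell
# Consume event lines from stdin. The body stands in for arbitrary slow
# per-event work; here it only echoes each line back.
process_events() {
  local line
  while IFS= read -r line; do
    # ... slow work on "$line" would go here ...
    printf 'handled: %s\n' "$line"
  done
}

# Hypothetical wiring (not run here; events.log is an arbitrary file name):
#   inotifywait -q -m -e create -e close_write ./ >> events.log &
#   tail -n +1 -f events.log | process_events
```

With this layout inotifywait only ever writes to a file, so a slow consumer delays the processing without blocking inotifywait's write(). The file grows without bound, though, so it needs to be rotated or truncated at some point.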