In an answer over on Stack Overflow, I provided a code sample to perform some small task referenced in the question. The original question had to do with the fastest-performing technique (so performance criteria are in play, here).
Another commenter/answerer suggested that making a POSIX-defined system API call (in this case, readdir
) was not as fast as making a direct system call into the kernel (syscall(SYS_getdents,...)
) and the claimed performance difference is in the 25% range. (I didn't implement and re-benchmark; I believe that the performance could in fact be better.)
My question is about the performance characteristics of the proposed syscall-based solution and why they might be faster. I can think of a few reasons why performance might be better:
- POSIX
readdir
is inherently more complicated thansyscall(SYS_getdents,...)
/getdents()
readdir
(which presumably callssyscall(SYS_getdents,...)
simply adds the overhead of indirectionreaddir
only returns one record (per kernel-call) versussyscall(SYS_getdents,...)/
getdents()` which returns (presumably) more than one record per kernel-call
I can't imagine that #1 above is true. readdir
and getdents
are so similar that the implementation of readdir
in glibc simply can't have many more "true" system calls than a direct-invocation of syscall(SYS_getdents,...)
/getdents()
would invoke.
I can't imagine that #2 is true, either, since calling readdir
likely wraps getdents
and also syscall(SYS_getdents,...)
likely calls getdents
as well (the proposed answer specifically uses syscall(SYS_getdents,...)
instead of calling getdents
directly. It's possible that everything within glibc on Linux boils down to syscall(syscallid, args)
in which case #2 probably is true.
The last possibility seems to me to be the best explanation: fewer calls into the kernel simply results in faster performance.
Is there any specific explanation for why a "direct kernel call" would be measurably faster than calling a POSIX-defined function?
Best Answer
Factors like
PLT
indirection orsyscall()'s
variadic-ness (registers have to be saved to memory) should play little role given thatgetdents
is one of the most expensive calls in Linux.Fully reading an empty directory on my machine costs about 5µs, a 100-item directory 37µs, a 1000-item directory 340µs and a 10,000-item directory 3.79ms.
What
fdopendir
+readdir
does on top ofgetdents
is it adds a buffer allocation/deallocation (0.1µs) and astat
check that the supplied fd is of the directory variety (0.4µs).readdir
then makes one cheap call per directory entry (moves a position in the buffer and possibly refills).So there is something like a one-time overhead of 0.5µs, which is 10% of the directory scanning time for empty directories, but only 1% for 100-item directories and negligibly little for larger ones. This overhead is 5 times less (only the allocation/deallocation cost) if you you don't need fdopen. (you only need fdopen if you can't
diropen
directly and therefore must go through a separately obtained (e.g.,openat
'ted) filedescriptor).So if you use a custom one-time allocated buffer along with
getdents
, you can save 2-10% on the scanning cost of empty directories and negligibly little on the bigger ones.As for the
readdir
calls, the cost of PLT indirection on modern hardware is typically less than a 1ns, function call overhead is about 1-2ns. Given that the directory scanning times are in the order of microseconds, you'd need to make at least 1000readdir
calls for these factors to translate into a single µs, but then the scanning cost is 340µs and the accrued 1 extra µs is like 0.3% of that -- negligible effect. Inlining thesereaddir
(thus eliminating the call overhead and the PLT overhead) would only serve to expand the code, but it wouldn't much improve performance asgetdents
is very much the bottleneck there.(
readdir_r
is more expensive because of additional locking, but you shouldn't needreaddir_r
because plainreaddir
calls are typically thread-safe unless you have multiple threads calling them on the same directory stream. POSIX might not be explicit about this yet, but I believe this guarantee should be getting standardized soon given that glibc went as far as to deprecatereaddir_r
.)