The mighty strace
has let me down. How is this possible?
time foo
shows that foo
takes several seconds to run ("real"), but uses negligible cpu time, both in userspace ("user") and in the kernel ("sys"). For the curious, foo
is defined below.
So it spends most of its time waiting for something else, not executing CPU instructions. Normally, I can see how it is waiting in strace
– i.e. what system call is blocking for a long period of time. Unfortunately this approach didn't work.
strace -ttt -T -C -w foo
shows system calls, timestamped, and a summary of the (real) time spent in system calls. But this particular process showed as spending negligible overall (real) time inside system calls.
foo
is actually journalctl -b -u dev-hugepages.mount
. Except that I had to change the last argument to a different systemd unit each time in order to reproduce this. In other words, the delay I am investigating happened the first time that I try to get the logs for any one systemd unit. EDIT: after answering the main question, I also realized the reason I was having this problem reproducing the delay.
The time spent by this process is a specific issue, apparently it does not occur on all systems. https://github.com/systemd/systemd/issues/7963
Best Answer
The usual reason for hitting this issue, is that the process is blocking in page faults. These are reads or possibly writes to files performed through a memory mapping aka
mmap()
. You may have noticed somemmap()
in the trace of system calls.If you had used the
/usr/bin/time
program instead of thetime
shell builtin, you might also have noticed:major
pagefaults are the ones that require filesystem IO.minor
pagefaults are much less significant (probably only a "TLB miss").I suspect
inputs
are the total number of pages read. Currently, I think file mapped pages are always the same size. 4096 bytes in most cases, but you can checkgetconf PAGESIZE
.So this represents ~290 megabytes, read at something over 100 megabytes per second, a standard speed for a hard disk like mine. Mystery solved!
Note also, you are assuming that you have a whole free CPU for this process. Otherwise, the process could simply be blocked waiting for other processes to yield the CPU.
strace
only shows when the process enters (and then leaves) the kernel due to a system call. Or when a unix signal is delivered. However there are other types of interrupts whichstrace
does not show at all. So these include