Under Linux, I have a process that is blocked in uninterruptible sleep (state D). How can I investigate what's causing this?
I am running an “ordinary” kernel (a Debian build), without any special debugging features.
There is no relevant log entry — in fact nothing got logged between the time the process started and the time I noticed it.
strace can't even attach to the process since it's in uninterruptible sleep. And even if I knew what system call was called, that wouldn't necessarily help me. I need to know what's going on inside the kernel.
Specifically, the sync command goes into uninterruptible sleep 🙁 So I must have an I/O problem somewhere but all my filesystems appear to work normally. There may well be an old log entry about an I/O error but I can't find it (this machine hasn't rebooted in a long time, that's a lot of log entries). Can I at least know which subsystem is blocking sync? For example, get a kernel backtrace for the kernel thread corresponding to a particular PID/TID?
(I'm sure that rebooting would either fix this or reveal the error but I'm asking how to investigate this, not how to blindly press a button.)
Best Answer
It's a bit late, but it could be helpful for others.
What I did:
1. cat /proc/PID/stack to get some direction. In my case it pointed to inodes and the filesystem.
2. cat /proc/PID/syscall to get the current system call. In my case the syscall number was 3, which stands for the close syscall (on x86-64), and the first argument was 6, a file descriptor. So it was trying to call close(6).
3. lsof -p PID, but my descriptor wasn't listed there.
4. lsof. It was my case.

Good luck
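The steps above can be wrapped in a few small shell helpers. This is only a rough sketch: the function names (list_d_state, decode_syscall_line, inspect_pid) are made up for this example, it assumes the procps version of ps, and it assumes your kernel exposes /proc/PID/stack and /proc/PID/syscall (reading them usually needs root). The decoder only splits the line into the syscall number and the first argument; looking up what that number means is still up to you, since syscall numbers depend on the architecture.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any standard tool.

# List processes currently in uninterruptible sleep (STAT starts with D),
# together with the kernel symbol they are waiting in (wchan), if available.
list_d_state() {
    ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /^D/'
}

# Decode one line of /proc/PID/syscall.
# Documented formats (see proc(5)):
#   "NR ARG0 ... ARG5 SP PC"  blocked inside syscall NR (arguments in hex)
#   "-1 SP PC"                blocked, but not inside a syscall
#   "running"                 not blocked at all
decode_syscall_line() {
    set -- $1    # split the line into fields (positional parameters)
    case "$1" in
        running) echo "not in a syscall (running)" ;;
        -1)      echo "blocked outside a syscall" ;;
        *)       echo "syscall=$1 arg0=$(printf '%d' "$2")" ;;
    esac
}

# Print state, kernel stack and decoded syscall for one PID.
inspect_pid() {
    pid=$1
    echo "== state =="
    grep '^State:' "/proc/$pid/status"
    echo "== kernel stack =="
    cat "/proc/$pid/stack" 2>/dev/null || echo "(not readable, try as root)"
    echo "== syscall =="
    if line=$(cat "/proc/$pid/syscall" 2>/dev/null); then
        decode_syscall_line "$line"
    else
        echo "(not readable, try as root)"
    fi
}
```

After sourcing the file, run list_d_state to find the stuck PID, then inspect_pid with that PID. An arg0 value is only a file descriptor when the syscall actually takes one as its first argument, as close does.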