Linux – Suggestions needed to debug why ps -ef gets stuck

linux process suse troubleshooting

A few of my processes consume 100% CPU. I'm trying to figure out which scripts are causing it.

I tried running strace ps -ef:

open("/proc/PID/status", O_RDONLY) = 6
read(6, "Name:\textract\nState:\tR (running)"..., 1023) = 1023
close(6) = 0
open("/proc/PID/cmdline", O_RDONLY) = 6
read(6,

So it gets stuck trying to read /proc/PID/cmdline. I tried to cat that file, and it got stuck again. Something is obviously screwed in the kernel; what should I try next?

Note: rebooting doesn't help. If I shut down manually, the problem comes back after the restart. I'm using SUSE Linux Enterprise Server 11 (x86_64), Linux 2.6.27.19.


Edit: ps -e produces output, and I found there are too many grep processes. Their number varies: 250, then 450, and now I see around 520. I traced them back to a cron script, but I still have to understand what those cron scripts do. Yes, top displays results. We manually shut down the server 2 days ago, and the system has been running for those 2 days since. I see some Oracle processes running all the time. I just ran a memory test, and no faults were detected.

Best Answer

Had that just yesterday. The problem was, one process was in "uninterruptible sleep" state, shown as status D in top. ls /proc/ does not return and is not abortable. ps -ef does not return and is not abortable.

If rebooting does not help, you probably have a bad sector on your DVD or hard disk, and the process PID is trying to read from it during startup. So technically rebooting does help, but the error recurs automatically.

Check with top whether the process is indeed in status D, then go on from there. Boot the computer without starting this process (for example, from a rescue system). Then start the program under strace and see which files it accesses. I bet one of those files has bad sectors.
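If top is awkward to use on the affected box, the check for status D can also be sketched in a few lines of Python 3. This is only a rough sketch, not a tested fix for this system: it assumes that reading /proc/PID/stat does not block on the stuck process the way /proc/PID/cmdline does. The strace output in the question shows /proc/PID/status being read without trouble, which suggests this, but does not prove it. The function names parse_stat_line and processes_in_state are made up for this example.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the LAST ')' instead of on whitespace.
    """
    left, right = line.rsplit(")", 1)
    pid_text, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_text), comm, state


def processes_in_state(wanted="D", proc_root="/proc"):
    """Return sorted (pid, comm, state) tuples for processes in state `wanted`.

    Only /proc/<pid>/stat is read; cmdline is deliberately never touched
    (assumption: stat stays readable even when cmdline hangs).
    """
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as handle:
                record = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited during the scan, or the line is malformed
        if record[2] == wanted:
            found.append(record)
    return sorted(found)


if __name__ == "__main__":
    for pid, comm, state in processes_in_state("D"):
        print(pid, state, comm)
```

Calling processes_in_state("R") instead lists the processes that are currently runnable, which may help to see which process names (such as the many greps mentioned in the question) are the ones keeping the CPU busy.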
