Linux – Process which locks up, ignores SIGKILL, is runnable (not a zombie or in uninterruptible sleep). What state is it in?

linux-kernel process redis

I have a process which has stopped responding several times now and appears to lock up completely. It doesn't respond to any attempt to strace it or to inspect it with gdb (gdb just hangs in a wait4() syscall). The process is runnable, and is neither waiting on a syscall (/proc/X/syscall: running) nor in uninterruptible sleep (/proc/X/status: State: R (running)).

What state is this process in exactly? Is this possibly a kernel bug of some type?

The process is redis, and this has happened a few times now. The only thing that can kill the process, it seems, is a reboot. The OS is CentOS 7.

Edit: Kernel version is 3.10.0-123.13.2.el7.x86_64. Trying an update to 3.10.0-229.11.1.el7 to see if that makes any difference.

Best Answer

wait4 is a syscall indicating that the caller is waiting for one of its children to terminate. This may point to an issue with signal handling.

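A bit brutal, but you may try to kill the whole hierarchy of the app: kill -15 -$YourRedisPID. The - before the PID makes kill signal the process group with that ID, which reaches the process and its children as long as they share that group (it fails if no process group has that ID). As it seems to be waiting for a child to terminate, this may unblock it.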

If that doesn't work, let's dig deeper: check the signal status of the process with grep ^Sig /proc/$YourRedisPID/status

You'll see something like:

SigQ:   8/62777
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000080
SigCgt: 0000000180004023

As defined in fs/proc/array.c in the kernel source, SigQ shows the number of signals currently queued (counted for the real user ID of the process) / the limit on queued signals.

If the number of queued signals is too high, it may indicate that your SIGKILL is not being handled at all. I'm still reading kernel/signal.c to understand how these special signals are managed.
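To compare the two SigQ numbers without reading them by eye, something along these lines could be used. It is only a sketch: sigq_usage is a hypothetical helper that reads a /proc/PID/status stream on standard input and prints the queued count and the limit.

```shell
# sigq_usage: read /proc/PID/status on stdin, print "queued limit".
# Hypothetical helper; splits the SigQ line on ':', '/', spaces and tabs.
sigq_usage() {
    awk -F'[:/ \t]+' '/^SigQ:/ { print $2, $3 }'
}

# Possible usage (not executed here):
#   sigq_usage < "/proc/$YourRedisPID/status"
```

The second number is the per-process limit on queued signals (RLIMIT_SIGPENDING), so a first number close to it would be worth mentioning when reporting the problem.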

To view the masks in binary, try this one-liner: awk 'BEGIN{print "ibase=16;obase=2;"} /^Sig...:/{ print toupper($2)}' /proc/$YourRedisPID/status | BC_LINE_LENGTH=0 bc

On my machine this prints:

0
0
10000000
110000000000000000100000000100011
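The binary strings are easier to read as signal numbers: bit 0 of each mask stands for signal 1, bit 1 for signal 2, and so on. A small sketch of that conversion follows; decode_sigmask is a hypothetical helper, and it assumes the mask fits in a signed 64-bit shell integer (bit 63 not set).

```shell
# decode_sigmask HEX: print the signal numbers whose bits are set in a
# SigPnd/SigBlk/SigIgn/SigCgt mask. Hypothetical helper.
# Assumption: the value is below 2^63 so shell arithmetic can hold it.
decode_sigmask() {
    mask=$((0x$1))
    sig=1
    out=""
    while [ "$sig" -le 64 ]; do
        # Bit (sig - 1) set means signal number sig is in the mask.
        if [ $(( (mask >> (sig - 1)) & 1 )) -eq 1 ]; then
            out="$out $sig"
        fi
        sig=$((sig + 1))
    done
    echo "${out# }"
}

# Possible usage with the sample values shown above:
#   decode_sigmask 0000000000000080
#   decode_sigmask 0000000180004023
```

For the sample SigCgt value this gives signals 1, 2, 6, 15, 32 and 33; kill -l followed by a number translates it to a name. SIGKILL (9) can never show up in SigBlk, SigIgn or SigCgt, because it cannot be blocked, ignored or caught.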

As a start, please post this output. I'll update the answer as needed.

Related Solutions

Way to identify which process turns into Zombie process

The audit subsystem of the Linux kernel can be very useful to figure out what processes are becoming zombie processes. I just had the following situation:

server ~ # ps -ef --forest
[...]
root     16385     1  0 17:04 ?        00:00:00 /usr/sbin/apache2 -k start
root     16388 16385  0 17:04 ?        00:00:00  \_ /usr/bin/perl -T -CSDAL /usr/lib/iserv/apache_user
root     16389 16385  0 17:04 ?        00:00:00  \_ /usr/bin/perl -T -CSDAL /usr/lib/iserv/apache_user
www-data 16415 16385  0 17:04 ?        00:00:00  \_ /usr/sbin/apache2 -k start
www-data 18254 16415  0 17:23 ?        00:00:00  |   \_ [sh] <defunct>
www-data 18347 16415  0 17:23 ?        00:00:00  |   \_ [sh] <defunct>
www-data 22966 16415  0 18:18 ?        00:00:00  |   \_ [sh] <defunct>
www-data 16583 16385  0 17:05 ?        00:00:01  \_ /usr/sbin/apache2 -k start
www-data 18306 16583  0 17:23 ?        00:00:00  |   \_ [sh] <defunct>
www-data 18344 16583  0 17:23 ?        00:00:00  |   \_ [sh] <defunct>
www-data 17561 16385  0 17:12 ?        00:00:00  \_ /usr/sbin/apache2 -k start
www-data 22983 17561  0 18:18 ?        00:00:00  |   \_ [sh] <defunct>
www-data 18318 16385  0 17:23 ?        00:00:00  \_ /usr/sbin/apache2 -k start
www-data 19725 16385  0 17:43 ?        00:00:01  \_ /usr/sbin/apache2 -k start
www-data 22638 16385  0 18:13 ?        00:00:00  \_ /usr/sbin/apache2 -k start
www-data 22659 16385  0 18:14 ?        00:00:00  \_ /usr/sbin/apache2 -k start
www-data 25102 16385  0 18:41 ?        00:00:00  \_ /usr/sbin/apache2 -k start
www-data 25175 16385  0 18:42 ?        00:00:00  \_ /usr/sbin/apache2 -k start
www-data 25272 16385  0 18:44 ?        00:00:00  \_ /usr/sbin/apache2 -k start

The cause for these zombie processes is most probably a PHP script, but as these Apache child processes are processing lots of HTTP requests and lots of different PHP scripts, it's very hard to figure out which one could be responsible. Linux has also already deallocated important information of these zombie processes, so we don't even have /proc/<pid>/cmdline to figure out which script or -c command /bin/sh may have been running:

server ~ # cat /proc/18254/cmdline 
server ~ #

To figure it out, I've installed auditd: https://linux-audit.com/configuring-and-auditing-linux-systems-with-audit-daemon/

I set up the following audit rules:

auditctl -a always,exit -F arch=b32 -S execve -F path=/bin/dash
auditctl -a always,exit -F arch=b64 -S execve -F path=/bin/dash

These rules audit all process creations of the /bin/dash binary. /bin/sh doesn't work here, because it's a symlink and audit apparently only sees the target file name:

server ~ # ls -l /bin/sh
lrwxrwxrwx 1 root root 4 Nov  8  2014 /bin/sh -> dash*
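Because the rule has to name the symlink target, the path can be resolved before writing the rules. A minimal sketch, assuming readlink -f is available (GNU coreutils or busybox); audit_rules_for is a hypothetical helper that only prints the auditctl commands and does not run them.

```shell
# audit_rules_for PATH: print (do not run) execve audit rules for the file
# that PATH finally resolves to. Hypothetical helper; needs readlink -f.
audit_rules_for() {
    target=$(readlink -f "$1") || return 1
    for arch in b32 b64; do
        echo "auditctl -a always,exit -F arch=$arch -S execve -F path=$target"
    done
}

# Possible usage (not executed here):
#   audit_rules_for /bin/sh
```

On the system shown above this would print the two /bin/dash rules; on a system where /bin/sh points elsewhere, it prints rules for that target instead.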

A simple test should now produce audit logs in /var/log/audit/audit.log (I've taken the liberty and added a lot of line breaks to improve the readability):

server ~ # sh -c 'echo test'
test

server ~ # tail -f /var/log/audit/audit.log
[...]
type=SYSCALL msg=audit(1488219335.976:43871): arch=40000003 syscall=11 \
  success=yes exit=0 a0=ffdca3ec a1=f7760e58 a2=ffdd399c a3=ffdca068 items=2 \
  ppid=27771 pid=27800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 \
  fsgid=0 tty=pts7 ses=7532 comm="sh" exe="/bin/dash" key=(null)
type=EXECVE msg=audit(1488219335.976:43871): argc=3 a0="sh" a1="-c" \
  a2=6563686F2074657374
type=CWD msg=audit(1488219335.976:43871):  \
  cwd="/var/lib/iserv/remote-support/iserv-martin.von.wittich"
type=PATH msg=audit(1488219335.976:43871): item=0 name="/bin/sh" inode=10403900 \
  dev=08:01 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
type=PATH msg=audit(1488219335.976:43871): item=1 name=(null) inode=5345368 \
  dev=08:01 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
type=PROCTITLE msg=audit(1488219335.976:43871): \
  proctitle=7368002D63006563686F2074657374

Lots of the information is encoded, but ausearch can translate it with -i:

server ~ # ausearch -i -x /bin/dash | tail                                      
[...]
----
type=PROCTITLE msg=audit(27.02.2017 19:15:35.976:43871) : proctitle=sh 
type=PATH msg=audit(27.02.2017 19:15:35.976:43871) : item=1 name=(null) \
  inode=5345368 dev=08:01 mode=file,755 ouid=root ogid=root rdev=00:00 \
  nametype=NORMAL 
type=PATH msg=audit(27.02.2017 19:15:35.976:43871) : item=0 name=/bin/sh \
  inode=10403900 dev=08:01 mode=file,755 ouid=root ogid=root rdev=00:00 \
  nametype=NORMAL 
type=CWD msg=audit(27.02.2017 19:15:35.976:43871) :  \
  cwd=/var/lib/iserv/remote-support/iserv-martin.von.wittich 
type=EXECVE msg=audit(27.02.2017 19:15:35.976:43871) : argc=3 a0=sh a1=-c \
  a2=echo test 
type=SYSCALL msg=audit(27.02.2017 19:15:35.976:43871) : arch=i386 \
  syscall=execve success=yes exit=0 a0=0xffdca3ec a1=0xf7760e58 a2=0xffdd399c \
  a3=0xffdca068 items=2 ppid=27771 pid=27800 auid=root uid=root gid=root \
  euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts7 \
  ses=7532 comm=sh exe=/bin/dash key=(null) 
----
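The encoded fields in the raw log, such as a2=6563686F2074657374 and the proctitle value, are plain hex dumps of the argument bytes, with NUL bytes separating the arguments in proctitle. When ausearch is not at hand, a rough decoder could look like the sketch below; hex_to_text is a hypothetical helper, and printing NUL bytes as spaces is a choice made here for readability.

```shell
# hex_to_text HEX: decode a hex-encoded audit field to text.
# Hypothetical helper. NUL bytes (00) are printed as spaces.
hex_to_text() {
    hex=$1
    # Refuse odd-length input: it cannot be a whole number of bytes.
    [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
    while [ -n "$hex" ]; do
        rest=${hex#??}          # everything after the first two characters
        pair=${hex%"$rest"}     # the first two characters (one byte)
        hex=$rest
        if [ "$pair" = "00" ]; then
            printf ' '
        else
            # Convert the byte to a three-digit octal escape and print it.
            printf "\\$(printf '%03o' "0x$pair")"
        fi
    done
    printf '\n'
}

# Possible usage with the values from the log above:
#   hex_to_text 6563686F2074657374
#   hex_to_text 7368002D63006563686F2074657374
```

The first call yields the a2 argument (echo test) and the second the full process title (sh -c echo test).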

If you don't want to restrict the ausearch filtering to /bin/dash, you can also use ausearch -i -m ALL to translate the complete log. Another good filter would be ausearch -i -p <PID of a zombie process>, in this case ausearch -i -p 27800.

Just leave these rules in place until new zombie processes show up, and then search for the process creation of a zombie PID:

ausearch -i -p <PID>

This should be very helpful to identify the root cause of the zombie processes. In my case it was a PHP script that used proc_open to spawn a Perl script without closing the handle with proc_close.

Linux – What happens when sending SIGKILL to a Zombie Process in Linux

To answer that question, you have to understand how signals are sent to a process and how a process exists in the kernel.

Each process is represented as a task_struct inside the kernel (the definition is in the sched.h header file). That struct holds information about the process, for instance the pid. The important information is in line 1566 of that file, where the associated signal is stored. This is set only if a signal is sent to the process.

A dead process or a zombie process still has a task_struct. The struct remains, until the parent process (natural or by adoption) has called wait() after receiving SIGCHLD to reap its child process. When a signal is sent, the signal_struct is set. It doesn't matter if the signal is a catchable one or not, in this case.

Signals are evaluated every time when the process runs. Or to be exact, before the process would run. The process is then in the TASK_RUNNING state. The kernel runs the schedule() routine which determines the next running process according to its scheduling algorithm. Assuming this process is the next running process, the value of the signal_struct is evaluated, whether there is a waiting signal to be handled or not. If a signal handler is manually defined (via signal() or sigaction()), the registered function is executed, if not the signal's default action is executed. The default action depends on the signal being sent.

For instance, the SIGSTOP signal's default handler will change the current process's state to TASK_STOPPED and then run schedule() to select a new process to run. Notice, SIGSTOP is not catchable (like SIGKILL), therefore there is no possibility to register a manual signal handler. In case of an uncatchable signal, the default action will always be executed.
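The difference between a registered handler and the default action can be observed from a shell. The helpers below are a sketch only (the function names are made up for illustration): each starts a throwaway child shell that signals itself and then prints the child's exit status, relying on the usual shell convention of reporting 128 plus the signal number for a child killed by a signal.

```shell
# SIGTERM with a handler installed via trap: the handler runs and the
# child exits with the status the handler chose.
term_with_handler() {
    status=0
    sh -c 'trap "exit 0" TERM; kill -TERM $$; sleep 1; exit 1' || status=$?
    echo "$status"
}

# SIGTERM without a handler: the default action terminates the child.
term_default() {
    status=0
    sh -c 'kill -TERM $$; sleep 1; exit 1' || status=$?
    echo "$status"
}

# SIGKILL: no handler can be registered, so the default action always runs.
kill_default() {
    status=0
    sh -c 'kill -KILL $$; sleep 1; exit 1' || status=$?
    echo "$status"
}
```

With SIGTERM being signal 15 and SIGKILL signal 9, the three helpers print 0, 143 and 137 respectively.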

To your question:

A defunct or dead process will never be put in the TASK_RUNNING state by the scheduler again. Thus the kernel will never run the signal handler (default or defined) for the corresponding signal, whichever signal it was. Therefore the exit_signal will never be set again. The signal is "delivered" to the process by setting the signal_struct in the task_struct of the process, but nothing else will happen, because the process will never run again. There is no code to run; all that remains of the process is that process struct.

However, if the parent process reaps its children by wait(), the exit code it receives is the one when the process "initially" died. It doesn't matter if there is a signal waiting to be handled.
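This behaviour can be reproduced with a short experiment. The sketch below assumes Linux with /proc mounted; all function names are hypothetical. It creates a parent that never calls wait(), lets its child exit so that it becomes a zombie, sends SIGKILL to the zombie, and prints the zombie's state letter before and after.

```shell
# proc_state PID: print the state letter from /proc/PID/stat.
proc_state() {
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    rest=${stat##*) }      # drop "pid (comm) "
    echo "${rest%% *}"     # first remaining field is the state
}

# child_of PID: print the first PID whose parent is PID (scans /proc).
child_of() {
    for f in /proc/[0-9]*/stat; do
        s=$(cat "$f" 2>/dev/null) || continue
        r=${s##*) }        # "state ppid pgrp ..."
        r=${r#* }          # "ppid pgrp ..."
        if [ "${r%% *}" = "$1" ]; then
            echo "${s%% *}"
            return 0
        fi
    done
    return 1
}

zombie_demo() {
    # The child shell starts 'sleep 0' and then replaces itself with
    # 'sleep 5', which never reaps the finished 'sleep 0'.
    sh -c 'sleep 0 & exec sleep 5' >/dev/null 2>&1 &
    parent=$!
    sleep 1
    zombie=$(child_of "$parent") || return 1
    before=$(proc_state "$zombie")
    kill -9 "$zombie"      # accepted by the kernel, but nothing runs
    sleep 1
    after=$(proc_state "$zombie")
    # Clean up: once the parent is gone the zombie is re-parented and reaped.
    kill "$parent" 2>/dev/null || true
    wait "$parent" 2>/dev/null || true
    echo "$before $after"
}
```

The state letter should be Z both before and after the SIGKILL; the entry only disappears once the parent is gone or calls wait().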
