Debian – Unable to kill process with PID 1 in docker container

debian docker kill process

I have the following Dockerfile for creating a container with a powerdns recursor in it:

FROM debian:stretch-slim
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && \
    apt-get install --no-install-recommends -y \
    pdns-recursor && \
    rm -rf /var/lib/apt/lists/* && \
    apt-get clean
COPY ./configuration/recursor.conf /etc/powerdns/recursor.conf
RUN chown -R :pdns /etc/powerdns/ && \
    chmod 0750 /etc/powerdns/ && \
    chmod 0640 /etc/powerdns/recursor.conf
EXPOSE 8699
ENTRYPOINT ["/usr/sbin/pdns_recursor", "--daemon=no"]

My recursor.conf looks like this:

config-dir=/etc/powerdns
forward-zones=resolver1.opendns.com=208.67.222.222
hint-file=/usr/share/dns/root.hints
local-address=0.0.0.0
local-port=8699
quiet=yes
security-poll-suffix=
setgid=pdns
setuid=pdns

IPv6 is disabled on the hypervisor.

The problem is that Docker is not able to stop the container properly with docker stop recursor. After some time the OOM killer terminates the program, and the container status shows:

Exited (137) 2 seconds ago

I searched the web and found that exit code 137 (128 + 9) supposedly means that I don't have sufficient RAM, which is simply not the case. When I execute docker exec -it recursor /bin/bash and try to kill PID 1 (kill -9 -- 1) within the container, I don't get any reaction: the service simply continues to run as if nothing happened.

I also tried to start the recursor in daemon mode, with the same result.

Does anyone have an idea why that is so?

Best Answer

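The process with PID 1 is the init process. That stays true in a PID namespace, and therefore in a container: a signal sent to PID 1 from inside its own namespace is only delivered if PID 1 has installed a handler for it. SIGKILL cannot be caught, so no handler for it can ever exist, and kill -9 1 issued inside the container is silently ignored, contrary to what happens with any other userland process.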

If you really want to kill it, you have to kill it from the host. Running on the host (with enough privileges, probably root):

kill -KILL $(docker inspect --format '{{.State.Pid}}' containername)

This will bring down the whole container, since removing its PID 1 means stopping the container. Please note that I answered the title of the question, not the underlying problem: what is causing the OOM.
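On the exit status quoted in the question: by shell convention, a status above 128 is 128 plus the number of the signal that ended the process, so 137 only says that the process died from signal 9 (SIGKILL). It does not by itself prove an out-of-memory kill. A minimal sketch of that arithmetic follows; describe_exit_status is a hypothetical helper name, not a docker command.

```shell
#!/bin/sh
# describe_exit_status is a hypothetical helper for illustration only.
# It applies the shell convention "status > 128 means 128 + signal number".
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        signum=$((status - 128))
        # kill -l N prints the name of signal number N (POSIX).
        echo "killed by signal $signum ($(kill -l "$signum"))"
    else
        echo "exited with status $status"
    fi
}

describe_exit_status 137   # prints: killed by signal 9 (KILL)
```

docker stop sends SIGTERM first and, if the container is still running after the grace period (10 seconds by default), SIGKILL, which yields the same status 137. To check whether the OOM killer was actually involved, docker inspect --format '{{.State.OOMKilled}}' recursor should report true or false.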

UPDATE: it is probably simpler to use docker kill, which sends the KILL signal by default. That would be:

docker kill containername
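If the goal is a clean docker stop rather than a forced kill, PID 1 has to install a handler for SIGTERM, because a signal without a handler is not delivered to the init process of a PID namespace. Whether pdns_recursor does so is not something this answer verifies. One possible workaround, sketched here under that assumption, is a small wrapper that runs as PID 1, traps TERM, and forwards it to the real service. The function name run_forwarding and the script path mentioned below are hypothetical.

```shell
#!/bin/sh
# Hypothetical entrypoint wrapper (a sketch, not a tested production script).
# The shell runs as PID 1, installs a TERM handler so the signal is not
# discarded, and forwards it to the real service.

run_forwarding() {
    "$@" &                          # start the real service in the background
    child=$!
    trap 'kill -TERM "$child" 2>/dev/null' TERM

    status=0
    wait "$child" || status=$?      # returns early (status > 128) if TERM is trapped

    # If wait was interrupted and the service still exists, wait again to
    # collect its real exit status.
    if [ "$status" -gt 128 ] && kill -0 "$child" 2>/dev/null; then
        status=0
        wait "$child" || status=$?
    fi

    trap - TERM
    return "$status"
}

# When used as an entrypoint, run the command given as arguments.
if [ "$#" -gt 0 ]; then
    run_forwarding "$@"
    exit $?
fi
```

Saved as, say, /entrypoint.sh, it would be used as ENTRYPOINT ["/entrypoint.sh", "/usr/sbin/pdns_recursor", "--daemon=no"]. An alternative that needs no script is docker run --init, which puts a minimal init process in front of the service.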

UPDATE2: to convince yourself that PID 1 cannot be killed with SIGKILL (aka -9) from inside its own namespace, even in a container, try the following (the example requires user namespaces to be enabled; otherwise just run unshare --mount-proc --fork --pid as root).

First terminal:

$ unshare --map-root-user --mount-proc --fork --pid
# echo $$
1
# pstree -p
bash(1)---pstree(88)
# kill -9 1
#

The kill has no effect.

On a second terminal:

$ pstree -p $(pidof unshare)
unshare(2023)───bash(2024)
$ kill -9 2024

Back on the first terminal:

# Killed
$ 