How to avoid dhcpcd sabotaging docker0 network

dhcpcd docker networking

How to reproduce:

  1. Start dhcpcd unless it's already running
  2. Restart the docker daemon
  3. Run docker run -it busybox ping -c 1 1.1.1.1

Expected behaviour: The command should succeed.

Actual behaviour: The command reports 100% packet loss.

Workaround:

Stopping the dhcpcd service avoids this issue, but is of course not a viable option.

The relevant line from the dhcpcd journal:

docker0: removing interface

Details:

$ docker --version
Docker version 18.09.6-ce, build 481bc77156
$ dhcpcd --version
dhcpcd 7.2.1
Copyright (c) 2006-2019 Roy Marples
Compiled in features: INET ARP ARPing IPv4LL INET6 DHCPv6 AUTH
$ systemctl --version
systemd 242 (242.29-1-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid
$ uname --kernel-name --kernel-release --kernel-version --machine --processor --hardware-platform --operating-system
Linux 5.1.9-arch1-1-ARCH #1 SMP PREEMPT Tue Jun 11 16:18:09 UTC 2019 x86_64 unknown unknown GNU/Linux

tl;dr workaround: enable dhcpcd per interface rather than globally.

I have no good explanation for why this happens. Hopefully somebody else will look into whether this is a dhcpcd bug or a Docker bug, or can otherwise explain why it started breaking today, after I had been running the latest dhcpcd version for six weeks and the latest Docker version for ten days (according to /var/log/pacman.log). I ran docker swarm init yesterday and docker swarm leave --force* today, but otherwise I can't think of any recent related changes.

A workaround is to let dhcpcd explicitly manage only those interfaces which need DHCP. First disable dhcpcd globally:

sudo systemctl stop dhcpcd
sudo systemctl disable dhcpcd

Then, for each interface that you actually use to connect to the Internet (see ip link for a list):

sudo systemctl enable dhcpcd@INTERFACE
sudo systemctl start dhcpcd@INTERFACE
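If it is unclear which names to substitute for INTERFACE, a small helper can narrow the list down. The sketch below rests on an assumption: on Linux, hardware-backed interfaces normally have a device entry under /sys/class/net, while lo, docker0 and veth* do not. The function name physical_ifaces is made up for illustration. The script only prints the systemctl commands and does not run them, so check the output before using it.

```shell
#!/bin/sh
# physical_ifaces [SYSFS_NET_DIR]
# Prints the interfaces that have a "device" entry. Assumption: this is a
# reasonable heuristic for "real" network hardware; virtual interfaces such
# as lo, docker0 and veth* normally lack the entry.
physical_ifaces() {
    dir=${1:-/sys/class/net}
    for path in "$dir"/*; do
        [ -e "$path/device" ] && basename "$path"
    done
    return 0
}

# Print (not run) one enable command per candidate interface.
# "enable --now" enables the unit and starts it in one step.
physical_ifaces | while read -r iface; do
    echo "sudo systemctl enable --now dhcpcd@$iface"
done
```

Review the printed lines, then run the ones for the interfaces you actually use for DHCP.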

* --force because, although I had done nothing else with docker swarm in the meantime, it wouldn't let me leave:

Error response from daemon: You are attempting to leave the swarm on a node that is participating as a manager. Removing the last manager erases all current state of the swarm. Use --force to ignore this message.

Working my way down the stack, removing the node didn't work:

Error response from daemon: rpc error: code = FailedPrecondition desc = node shbgqin74wuljj9skxt6y6ej7 is a cluster manager and is a member of the raft cluster. It must be demoted to worker before removal

And demoting didn't work either:

Error response from daemon: rpc error: code = FailedPrecondition desc = attempting to demote the last manager of the swarm

Best Answer

Late to the party, but some may find this useful:

https://www.daemon-systems.org/man/dhcpcd.conf.5.html

denyinterfaces pattern

         When discovering interfaces, the interface name must not match
         pattern which is a space or comma separated list of patterns
         passed to fnmatch(3).

So presumably you can add denyinterfaces veth* to dhcpcd.conf and be done with it.
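To get a feel for which interface names a pattern such as veth* would cover, the matching can be approximated with shell globbing, which follows essentially the same wildcard rules as fnmatch(3). This is only a sketch: the helper name denied and the sample interface names are made up for illustration, and dhcpcd itself is not involved.

```shell
#!/bin/sh
# denied NAME PATTERNS
# Succeeds if NAME matches any pattern in PATTERNS, a space or comma
# separated list, mirroring how the man page describes denyinterfaces.
# Hypothetical helper for illustration only; it does not call dhcpcd.
denied() {
    name=$1
    patterns=$(printf '%s' "$2" | tr ',' ' ')
    set -f                      # keep patterns from expanding against files
    for pattern in $patterns; do
        case $name in
            $pattern) set +f; return 0 ;;
        esac
    done
    set +f
    return 1
}

# Sample names (assumed, not taken from the question's machine).
for iface in veth1a2b3c docker0 eth0; do
    if denied "$iface" 'veth*'; then
        echo "$iface: would be ignored"
    else
        echo "$iface: still managed"
    fi
done
```

With these sample names, only veth1a2b3c matches. veth* on its own does not cover docker0, the interface named in the journal line, so a list such as veth* docker* may be worth trying. That is an assumption, not something tested here.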
