Kernel: 5.5.8-arch1-1
I am trying to get virtual networking working using a bridge attached to my physical interface. This is a typical setup, I'm not even trying to do anything weird.
- Bridge: br0
- Physical interface: enp6s0f0
The problem is that Linux isn't forwarding any IP traffic out the physical interface. It's forwarding ARP traffic both ways since ARP resolution works, but no IP traffic gets sent out of enp6s0f0.
Things I've tried:
- adding enp6s0f1 to the bridge, giving enp7s0f0 to the VM, and using a cable to link enp7s0f0 to enp6s0f1
- same result (ARP traffic forwarded, IP traffic not)
- stopping docker and flushing all tables
- no change
- disabling rp_filter
- no change
- using the onboard NIC
- no change (this was actually the initial setup and I dropped this quad-port card in to see if was the onboard NIC causing a problem)
- pinging the VM from another machine
- I could see the echo-request come in and I could see it on br0, but it was not forwarded to the VM port (either the vnet port or enp6s0f1)
- enabling STP on the bridge (it was initially disabled)
- no change
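For anyone debugging the same symptom, this is roughly how the "ARP forwarded, IP not" observation can be confirmed hop by hop (a sketch; interface names match the setup above):

```shell
# Run these in parallel terminals while pinging the VM from another
# machine. If the echo-request shows up on br0 but never on the VM
# port (vnet0), the bridge itself is dropping the IP traffic.
tcpdump -ni enp6s0f0 'icmp or arp'   # physical uplink
tcpdump -ni br0      'icmp or arp'   # the bridge
tcpdump -ni vnet0    'icmp or arp'   # the VM's tap port
```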
○ → ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp6s0f0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc mq master br0 state UP group default qlen 1000
link/ether 00:10:18:85:1c:c0 brd ff:ff:ff:ff:ff:ff
inet6 fe80::210:18ff:fe85:1cc0/64 scope link
valid_lft forever preferred_lft forever
3: enp6s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:10:18:85:1c:c2 brd ff:ff:ff:ff:ff:ff
4: enp7s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:10:18:85:1c:c4 brd ff:ff:ff:ff:ff:ff
5: enp7s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:10:18:85:1c:c6 brd ff:ff:ff:ff:ff:ff
6: enp9s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b4:2e:99:a6:22:f9 brd ff:ff:ff:ff:ff:ff
7: wlp8s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:71:90:4e:e9:77 brd ff:ff:ff:ff:ff:ff
8: br-183e1a17d7f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:ba:03:e1:9d brd ff:ff:ff:ff:ff:ff
inet 172.18.0.1/16 brd 172.18.255.255 scope global br-183e1a17d7f6
valid_lft forever preferred_lft forever
9: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:02:61:00:66 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
10: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:10:18:85:1c:c0 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.205/24 brd 192.168.1.255 scope global dynamic noprefixroute br0
valid_lft 9730sec preferred_lft 7930sec
inet6 fe80::210:18ff:fe85:1cc0/64 scope link
valid_lft forever preferred_lft forever
11: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UNKNOWN group default qlen 1000
link/ether fe:54:00:be:eb:3e brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:febe:eb3e/64 scope link
valid_lft forever preferred_lft forever
○ → brctl showstp br0
br0
bridge id 8000.001018851cc0
designated root 1000.44e4d9d88a00
root port 1 path cost 4
max age 19.99 bridge max age 19.99
hello time 1.99 bridge hello time 1.99
forward delay 14.99 bridge forward delay 14.99
ageing time 299.99
hello timer 0.00 tcn timer 0.00
topology change timer 0.00 gc timer 25.78
flags
enp6s0f0 (1)
port id 8001 state forwarding
designated root 1000.44e4d9d88a00 path cost 4
designated bridge 1000.44e4d9d88a00 message age timer 19.21
designated port 800d forward delay timer 0.00
designated cost 0 hold timer 0.00
flags
vnet0 (2)
port id 8002 state forwarding
designated root 1000.44e4d9d88a00 path cost 100
designated bridge 8000.001018851cc0 message age timer 0.00
designated port 8002 forward delay timer 0.00
designated cost 4 hold timer 0.22
flags
○ → bridge -d link show
2: enp6s0f0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 4
hairpin off guard off root_block off fastleave off learning on flood on mcast_flood on mcast_to_unicast off neigh_suppress off vlan_tunnel off isolated off enp6s0f0
8: br-183e1a17d7f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 master br-183e1a17d7f6 br-183e1a17d7f6
9: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 master docker0 docker0
10: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 br0
11: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 100
hairpin off guard off root_block off fastleave off learning on flood on mcast_flood on mcast_to_unicast off neigh_suppress off vlan_tunnel off isolated off vnet0
○ → sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
○ → sysctl net.ipv4.conf.br0.forwarding
net.ipv4.conf.br0.forwarding = 1
Best Answer
It appears that, probably because of an iptables rule added by Docker, you had the br_netfilter module loaded and active (i.e. sysctl net.bridge.bridge-nf-call-iptables returns 1). This makes bridged frames (Ethernet, layer 2) subject to iptables filtering (IP, layer 3). For example, this module gets loaded automatically whenever an iptables rule using the
physdev
match is in use, even in another network namespace. There is documentation explaining the side effects caused by this module. Those side effects are intended when using it for transparent bridge firewalling, and the iptables physdev match cannot work properly without it (it simply won't match anymore). The documentation also explains how to prevent its effects, especially in chapter 7. Rather than disabling this module's effect on iptables outright,
one should adapt the iptables rules as explained in chapter 7 to avoid the side effects; otherwise other, unrelated parts of the system may be disrupted.
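As a sketch of the chapter 7 approach: accept frames that are merely being bridged early in the FORWARD chain, so that rules written for routed traffic (Docker's, for instance) never see them. The exact placement relative to Docker's chains is up to you:

```shell
# Frames matched by --physdev-is-bridged are being switched at
# layer 2, not routed; accept them before layer-3 rules can drop them.
iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT
```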
Until kernel 5.3 this module was not namespace-aware: loading it suddenly enabled it in all network namespaces, causing all kinds of trouble where it was unexpected. Since then it is also possible to enable it per bridge (ip link set dev BRIDGE type bridge nf_call_iptables 1) rather than per namespace. Once tools (Docker...) and kernels (>= 5.3) catch up, having it enabled only in selected network namespaces and bridges should suffice, but today that is probably not the case. Also note that kernel 5.3 gained native bridge stateful firewalling usable by nftables, which will probably make this module obsolete once direct encapsulation/decapsulation support in the bridge for VLAN and PPPoE is available.