I'm working with an embedded Debian system and I'm having trouble getting Ethernet to work consistently. Roughly once every 5 to 10 times that eth0 is brought up, something fails: I can't connect to the device via ssh and it doesn't respond to ping. The fix is to either reboot or log in via the serial console and bring eth0 down and then up again. I can replicate the problem either by rebooting repeatedly or by issuing ifconfig eth0 down && ifconfig eth0 up repeatedly until the device stops responding.
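To put a number on "once every 5 to 10 times", the down/up cycle can be scripted and the failures counted. This is only a sketch: the names IFACE, SETTLE_SECS, cycle_iface, carrier_ok and count_failures are made up for illustration, and the 3-second settle time is an assumption. It reads the standard sysfs carrier attribute, which reports 1 when the link is up.

```shell
#!/bin/sh
# Hypothetical reproduction helper; names and timings are assumptions.
IFACE="${IFACE:-eth0}"
SETTLE_SECS="${SETTLE_SECS:-3}"   # assumed time to allow for autonegotiation

# Cycle the interface once, using the same commands as in the question.
cycle_iface() {
    ifconfig "$IFACE" down && ifconfig "$IFACE" up
}

# Succeeds when the kernel reports carrier (sysfs "carrier" reads 1).
carrier_ok() {
    [ "$(cat "/sys/class/net/$IFACE/carrier" 2>/dev/null)" = "1" ]
}

# Run N cycles and print how many of them ended without carrier.
count_failures() {
    n="$1"
    i=0
    failures=0
    while [ "$i" -lt "$n" ]; do
        cycle_iface
        sleep "$SETTLE_SECS"
        carrier_ok || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures/$n cycles failed"
}
```

Sourced from the serial console (not over ssh, since the link goes down), count_failures 50 would run fifty cycles and print the failure ratio.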
My /etc/network/interfaces is:
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet static
address 192.168.1.122
gateway 192.168.1.1
netmask 255.255.255.0
When networking works, dmesg says:
[ 2612.775183] PHY found at addr 7
[ 2612.776944] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 2614.414704] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
And when it doesn't, dmesg says:
[ 2617.224970] PHY found at addr 7
[ 2617.227005] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
When networking works, ifconfig outputs:
eth0 Link encap:Ethernet HWaddr 00:d0:69:46:d9:08
inet addr:192.168.1.122 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::2d0:69ff:fe46:d908/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1528 Metric:1
RX packets:3242 errors:0 dropped:0 overruns:0 frame:0
TX packets:1382 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:300701 (293.6 KiB) TX bytes:132344 (129.2 KiB)
Interrupt:22
And when it doesn't, the ifconfig output is:
eth0 Link encap:Ethernet HWaddr 00:d0:69:46:d9:08
inet addr:192.168.1.122 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST MULTICAST MTU:1536 Metric:1
RX packets:3355 errors:0 dropped:0 overruns:0 frame:0
TX packets:1430 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:310120 (302.8 KiB) TX bytes:136800 (133.5 KiB)
Interrupt:22
When networking works, ip link show eth0 outputs:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1528 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
link/ether 00:d0:69:46:d9:08 brd ff:ff:ff:ff:ff:ff
When things don't work, ip link show eth0 gives:
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1536 qdisc pfifo_fast state DOWN mode DEFAULT qlen 1000
link/ether 00:d0:69:46:d9:08 brd ff:ff:ff:ff:ff:ff
My current solution is to have a script parse the output of ip link show eth0 and restart eth0 until it comes up, but this seems pretty hacky.
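For reference, the retry workaround described above might look roughly like the following. It is a sketch, not the actual script: IFACE, MAX_TRIES, WAIT_SECS and the function names are assumptions, and it uses ifdown/ifup from ifupdown so that the address and gateway from /etc/network/interfaces are re-applied on each attempt. The check looks for the LOWER_UP flag that appears in the working ip link output and is missing in the failing one.

```shell
#!/bin/sh
# Sketch of the retry workaround; variable names and defaults are assumptions.
IFACE="${IFACE:-eth0}"
MAX_TRIES="${MAX_TRIES:-5}"
WAIT_SECS="${WAIT_SECS:-3}"   # the working dmesg log shows the link ready after about 1.6 s

# Succeeds if the `ip link show` text on stdin lists LOWER_UP
# (carrier present) among the flags between < and >.
has_carrier() {
    grep -q '<[^>]*LOWER_UP[^>]*>'
}

# Wait, check for carrier, and bounce the interface if it is missing.
# Gives up with status 1 after MAX_TRIES bounces.
bring_up_with_retry() {
    tries=0
    while :; do
        sleep "$WAIT_SECS"
        if ip link show "$IFACE" | has_carrier; then
            return 0
        fi
        if [ "$tries" -ge "$MAX_TRIES" ]; then
            return 1
        fi
        ifdown "$IFACE"
        ifup "$IFACE"
        tries=$((tries + 1))
    done
}

# Only act when invoked as: script.sh run
if [ "${1:-}" = "run" ]; then
    bring_up_with_retry || echo "$IFACE: no carrier after $MAX_TRIES retries" >&2
fi
```

Capping the number of retries keeps the script from looping forever when the cable really is unplugged.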
Any idea what the problem might be or where else I should be looking?
Edit:
Output from ethtool eth0 when things work:
Settings for eth0:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Supported pause frame use: No
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Link partner advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Link partner advertised pause frame use: Symmetric
Link partner advertised auto-negotiation: Yes
Speed: 100Mb/s
Duplex: Full
Port: MII
PHYAD: 7
Transceiver: internal
Auto-negotiation: on
Link detected: yes
Output from ethtool eth0 when it doesn't:
Settings for eth0:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Supported pause frame use: No
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Speed: 10Mb/s
Duplex: Half
Port: MII
PHYAD: 7
Transceiver: internal
Auto-negotiation: on
Link detected: no
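The two ethtool dumps differ in three places: Speed/Duplex, Link detected, and the "Link partner advertised" lines, which are absent in the failing case. A small helper can pull those fields out for logging on each attempt. The function names ethtool_field and partner_seen are hypothetical, and reading the missing partner lines as "autonegotiation did not complete" is an assumption, not something the output states directly.

```shell
#!/bin/sh
# Hypothetical helpers for comparing `ethtool eth0` dumps.

# Print the value of one "Name: value" field from ethtool output on stdin.
# Leading whitespace is tolerated because real ethtool output is indented.
ethtool_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1
}

# Succeeds if the output on stdin contains any "Link partner advertised"
# line, i.e. the PHY received something from the other end.
partner_seen() {
    grep -q '^[[:space:]]*Link partner advertised'
}
```

For example, ethtool eth0 | ethtool_field "Link detected" prints yes or no, and ethtool eth0 | partner_seen || echo "no link partner seen" flags the failing state.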
I also imaged the system I've been working on and tested it on a second, identical machine with different cables and a different router, and saw the same behavior.
Edit 2:
Per ttsiodras' observation I did some MTU testing. I found that when the device boots, the MTU is initially 1508. Every time I bring eth0 down and then back up, the MTU increases by 4, up to a maximum of 1540, after which it stays the same. Unfortunately there didn't seem to be any correlation between the MTU and when I would lose network connectivity. I also tried manually setting the MTU to a variety of values between 1508 and 1540, and the network would still occasionally fail regardless of the manual MTU setting.
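The drift described above is simple arithmetic: 1508 + 4n after n down/up cycles, capped at 1540, which is reached after (1540 - 1508) / 4 = 8 cycles. The values 1528 and 1536 in the ifconfig and ip link output earlier fit this pattern (n = 5 and n = 7). The sketch below models that and reads the live value from sysfs for comparison; the function names are hypothetical, and the model only restates the observation, it says nothing about the cause.

```shell
#!/bin/sh
# Model of the observed MTU drift; function names are assumptions.

# Expected MTU after n down/up cycles since boot: 1508 + 4n, capped at 1540.
expected_mtu() {
    mtu=$((1508 + 4 * $1))
    if [ "$mtu" -gt 1540 ]; then
        mtu=1540
    fi
    echo "$mtu"
}

# Current MTU of an interface (default eth0), from the standard sysfs attribute.
current_mtu() {
    cat "/sys/class/net/${1:-eth0}/mtu"
}
```

Comparing expected_mtu against current_mtu after each cycle would show whether the drift is really that regular.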
Best Answer
This may be related to the fact that Debian patches systemd slightly for backwards compatibility. That's a workaround, however, and one that is somewhat problematic; the full story can be found in the Debian wiki page on the subject. The goal is to fix this for Stretch (the next Debian release) by adding systemd-specific code to the packages containing rcS init scripts. Most of the work there has been done, but there's still a minor amount left.
Things which may be able to fix this issue:
- Write an rc.local which checks if the most important rcS scripts (for your situation) ran successfully, and which fixes things if not (running systemctl status foo.service may help here)
- Replace systemd on your systems by sysvinit (although that may be overkill)
- [...] ifupdown.
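The rc.local suggestion could be sketched as below. Several things here are assumptions: the unit name networking.service (the unit that runs ifupdown on Debian; check with systemctl list-units), the function names, and the extra carrier check. The carrier check is added because, in the question, eth0 still has its address when the link is down, so the unit state alone may not reveal the failure.

```shell
#!/bin/sh
# Sketch of a check for /etc/rc.local; UNIT, IFACE and function names are assumptions.
UNIT="${UNIT:-networking.service}"
IFACE="${IFACE:-eth0}"

# Succeeds if systemd considers the unit active.
unit_ok() {
    systemctl is-active --quiet "$UNIT"
}

# Succeeds when the kernel reports carrier on the interface.
carrier_ok() {
    [ "$(cat "/sys/class/net/$IFACE/carrier" 2>/dev/null)" = "1" ]
}

# If either check fails, log the unit status and re-run ifdown/ifup.
check_and_fix() {
    if unit_ok && carrier_ok; then
        return 0
    fi
    systemctl status "$UNIT" --no-pager >&2
    ifdown "$IFACE"
    ifup "$IFACE"
}
```

In rc.local this would be called as check_and_fix || true before the final exit 0, so that a failed repair does not abort the rest of the script.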