Getting WAN IP 0.0.0.0 after some time

internet wan

I hope this problem belongs here and you will be able to help me (otherwise please migrate the question to Superuser).

Background and problem description

I am connected to the internet via my cable TV provider. My connection looks like this: Thomson cable modem <> Asus RT-N66U router <> wired and wireless clients. Every time I manually reconnect the cable between my modem and router, the connection comes back. After some time (usually, but not always, 6 hours, while the lease time the router reports in its interface is 12 hours), the connection status is still "connected", but the router's WAN IP becomes 0.0.0.0. As a result, there is no internet access, and no access to my home server from the WAN. My ISP still claims that the modem is perfectly visible from the WAN and gets its IP properly. Whenever I restart my router (either from its console or using the physical switch), or disconnect and then reconnect the cable between modem and router, everything comes back to normal. The LAN works perfectly all the time.

BTW, I don't think it matters, but… Not long ago, my ISP put me behind a double NAT. As I need a public IP to access my home server, I contacted them to restore the previous state. That worked like a charm, and my DS was accessible from the WAN again, either via IP:port or DDNS:port.

Below, there are some details regarding my network environment:

Modem

  • It gets a public, non-static IP (however, it is semi-static, so that I have one IP for days, weeks or even months).
  • It runs in bridge mode, with all the ports totally transparent.
  • I have no access to its web interface – it is always configured by ISP.

Router

  • It is the only DHCP server for my LAN,
  • The modem is connected to the router over the router's vlan2, while vlan1 is my LAN,
  • There is a port forwarding rule on my router that allows access to the web panel of the DS (Diskstation): two custom ports, 666 and 999, are forwarded to IP 192.168.1.3, one for HTTP and the other for HTTPS; I have also forwarded port 80 to it,
  • The DS MAC is bound to IP 192.168.1.3,
  • There is no MAC cloning – the MAC my ISP has is the router's original MAC,
  • IPv6 is disabled

LAN clients are:

  • wirelessly connected computers (MBA, Windows), phones and tablet (iPhone, iPad, Windows Phone, Android Nexus 5),
  • Synology Diskstation DS211j, which is running their new software, DSM v 5.0 (it is configured to obtain its IP automatically from the router's DHCP),
  • a Samsung TV,
  • a Sony CMT-G2NiP audio system.

What I have tried so far – without success:

  • turning off all the devices and then turning them on again (all the options with disconnecting the power cord and leaving them off for longer times included),
  • setting DMZ on modem IP,
  • setting DMZ on DS IP,
  • enabling and then disabling UPnP on the router's WAN,
  • tweaking router's MTU from 1500 to 1492,
  • changing DHCP query frequency from Aggressive to Normal,
  • disabling router's firewall,
  • disabling DDNS on DS,
  • changing DNS on the router to Google's servers (one strange thing here: when DNS is set to be obtained automatically, the primary DNS address is also 0.0.0.0, while the secondary is OK),
  • restoring router settings to factory defaults,
  • 30-30-30 reset of the router,
  • flashing the router with third-party modified firmware (Merlin build 374.40 – currently installed),
  • enabling dual WAN in failover mode – thinking that maybe router will try to connect to second WAN after the primary one disconnects and as there is no such, it will re-establish the connection on primary one,
  • switching physical ports devices are connected to on my modem and router.

Some log entries that I think may play a role here:

  • Apr 6 17:05:47 dhcp client: bound 0.0.0.0 via 109.173.192.1 during 43200 seconds.
  • Apr 6 18:47:28 kernel: eth1: received packet with own address as source address
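To catch these two symptoms whenever they recur, the router's syslog could be scanned with a short script. This is only a minimal sketch: the regular expressions are modelled on the two lines quoted above, the function name `scan_log` is made up for illustration, and other firmware builds may word these messages differently.

```python
import re

# Patterns modelled on the two log lines quoted above (assumption: other
# firmware builds may use different wording).
BOUND_RE = re.compile(
    r"dhcp client: bound (?P<ip>\d+\.\d+\.\d+\.\d+) "
    r"via (?P<gw>\d+\.\d+\.\d+\.\d+) during (?P<lease>\d+) seconds"
)
OWN_SRC_RE = re.compile(
    r"kernel: (?P<iface>\w+): received packet with own address as source address"
)


def scan_log(lines):
    """Return a list of suspicious events found in the given log lines."""
    findings = []
    for line in lines:
        m = BOUND_RE.search(line)
        if m:
            # A lease bound to 0.0.0.0 is the symptom described in the question.
            if m.group("ip") == "0.0.0.0":
                findings.append(("bad_lease", m.group("gw"), int(m.group("lease"))))
            continue
        m = OWN_SRC_RE.search(line)
        if m:
            # The kernel saw a frame carrying the interface's own source MAC.
            findings.append(("own_source", m.group("iface")))
    return findings


if __name__ == "__main__":
    sample = [
        "Apr 6 17:05:47 dhcp client: bound 0.0.0.0 via 109.173.192.1 during 43200 seconds.",
        "Apr 6 18:47:28 kernel: eth1: received packet with own address as source address",
    ]
    for finding in scan_log(sample):
        print(finding)
```

Comparing the timestamps of the flagged lines with the moments the connection drops would show whether the two events are related.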

Some more settings I think may have something in common with this:

  • WAN connection type is Automatic IP (means: DHCP I think).
  • NAT on WAN is enabled, and UPnP is disabled.

TCP settings (accessible in Merlin's build):

  • TCP Timeout: Established: 1200
  • TCP Timeout: syn_sent: 120
  • TCP Timeout: syn_recv: 60
  • TCP Timeout: fin_wait: 120
  • TCP Timeout: time_wait: 120
  • TCP Timeout: close: 10
  • TCP Timeout: close_wait: 60
  • TCP Timeout: last_ack: 30
  • UDP Timeout: Assured: 180
  • UDP Timeout: Unreplied: 30

WAN NAT passthrough rules:

  • PPTP Passthrough: Enabled
  • L2TP Passthrough: Enabled
  • IPSec Passthrough: Enabled
  • RTSP Passthrough: Enabled
  • H.323 Passthrough: Enabled
  • SIP Passthrough: Enabled
  • Enable PPPoE Relay: Disabled

Does anyone have any idea how to get out of this?

(Should there be more info necessary, please indicate it in comments, I will update the question text.)

Best Answer

This is your problem:

  • Apr 6 18:47:28 kernel: eth1: received packet with own address as source address

Or, rather, it is a report of such a Very Bad Thing that it trumps all other issues. With such a condition extant, it isn't unfortunate that your connection drops after 6 hours, but rather a miracle that it even works at all. (I suspect the router's DHCP server might be dormant until the router receives its IP from the ISP's DHCP server, which leads me to suspect there's a rogue server on your VLAN 2.)

Your router's DHCP server appears to be answering its own DHCP REQUEST: At 1/2 DHCP Lease time, a client will attempt to renew its lease, which explains the 6 hour break-down.
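The arithmetic behind the 6-hour figure can be sketched as follows. It assumes the default DHCP timers from RFC 2131 (renewal T1 at 0.5 of the lease, rebinding T2 at 0.875), which a server may override, and it assumes the year 2014 for the log timestamp, since the log line itself carries no year. The function name `renewal_times` is only illustrative.

```python
from datetime import datetime, timedelta


def renewal_times(lease_seconds, bound_at):
    """Return (T1, T2, expiry) for a lease, using the RFC 2131 default fractions."""
    t1 = bound_at + timedelta(seconds=lease_seconds * 0.5)    # renew (unicast)
    t2 = bound_at + timedelta(seconds=lease_seconds * 0.875)  # rebind (broadcast)
    expiry = bound_at + timedelta(seconds=lease_seconds)
    return t1, t2, expiry


if __name__ == "__main__":
    # Lease length and bind time taken from the quoted log line;
    # the year 2014 is an assumption.
    bound_at = datetime(2014, 4, 6, 17, 5, 47)
    t1, t2, expiry = renewal_times(43200, bound_at)
    print("renew  (T1):", t1)
    print("rebind (T2):", t2)
    print("expiry     :", expiry)
```

With a 43200-second lease, T1 falls 21600 seconds (6 hours) after the bind, which matches the interval after which the connection usually breaks.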

I don't have any experience with your particular router, so I can't prescribe a specific remedy, but it appears that its DHCP client is operating on the same VLAN as its DHCP server and answering its own request; another possibility is that link-level broadcasts are somehow crossing VLANs.

The error above is a show-stopper, really. This situation ought never to happen. Were it not confined to DHCP packets, such a loop condition could cripple a switch, as the offending Ethernet frames are cloned and re-transmitted, eventually saturating the switching fabric. Note that it is not an IP-level error, but a MAC/Layer 2 error.

I hope this helps...
