Networking – What causes dropping of ARP response packets in a wireless network

arp, networking, ping, wireless-networking, wireshark

I have a network of wireless access points (APs) in my local area network (LAN).

Some PCs on the network get ping responses from certain other PCs/devices but not from others. I have not found a reliable pattern, but in brief it looks something like this:

Say we have a computer Alice, a Wifi AP Bob, and another Wifi AP/device Charlie.

Alice can ping Bob, Bob can ping Charlie, but Alice cannot ping Charlie.
(Here "can ping" means that ping responses are received.)
I have already disabled all firewalls, and allowed all ICMP responses.

Using Wireshark and tcpdump, I found that the ARP request (opcode 1) from Alice reaches its intended destination, Charlie, and that Charlie sends back an ARP response (opcode 2), but that response never reaches Alice.
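To make the opcode 1 / opcode 2 distinction concrete, here is a minimal sketch of how the 28-byte ARP payload of an IPv4-over-Ethernet frame can be decoded. The function name `parse_arp` and the dictionary keys are made up for illustration; the field layout follows the standard ARP packet format. It is a reading aid for captures, not a replacement for Wireshark or tcpdump.

```python
import struct

# ARP opcodes as they appear in captures: 1 = request, 2 = reply.
ARP_OPCODES = {1: "request", 2: "reply"}


def _mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def _ip(raw: bytes) -> str:
    return ".".join(str(b) for b in raw)


def parse_arp(payload: bytes) -> dict:
    """Decode an IPv4-over-Ethernet ARP payload (the bytes after the
    14-byte Ethernet header). Hypothetical helper for illustration."""
    if len(payload) < 28:
        raise ValueError("an IPv4-over-Ethernet ARP payload is 28 bytes")
    htype, ptype, hlen, plen, opcode = struct.unpack("!HHBBH", payload[:8])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        raise ValueError("not an IPv4-over-Ethernet ARP packet")
    sha, spa, tha, tpa = struct.unpack("!6s4s6s4s", payload[8:28])
    return {
        "opcode": ARP_OPCODES.get(opcode, f"unknown({opcode})"),
        "sender_mac": _mac(sha),
        "sender_ip": _ip(spa),
        "target_mac": _mac(tha),
        "target_ip": _ip(tpa),
    }
```

In a healthy exchange, the reply seen at Charlie should name Charlie as the sender and carry Alice's MAC address as the target. If a capture taken on Bob shows the reply arriving on one interface but not leaving on the other, the bridge in between is the place to look.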

What technical faults could cause the response to be dropped like this?

How can I debug this situation?

Assuming that I have some programmatic control because I am using OpenWRT, how can I solve this problem?

The funny thing is that when I changed the name of my Windows 8 PC, this problem was rectified. Not sure if this is a case of post hoc ergo propter hoc.

Update: The APs/devices/PCs are on the same subnet, linked using bridge mode.

Best Answer

While I'm not super familiar with advanced configuration on OpenWRT (it's on my to-do list of geek projects), my first piece of advice would be to ensure that you're not doing NAT on "Bob". If Alice were on the LAN side of a wireless access point (WAP) and Charlie were on the WAN side, then Alice would be able to ping Charlie but not vice versa. That's the inherent firewall that NAT provides.

For this not to be the case, all of your APs have to be operating in some form of "bridge" mode or "access point" mode. This means that the device acts more or less as a plain packet forwarder: it does not do any routing or packet inspection of its own. The easiest way to achieve this on cheaper routers is to disable the router's DHCP server and then connect one of its LAN ports to your network (also make sure that the router's LAN IP doesn't conflict with your actual gateway). Leave the WAN port unconnected. If the router complains (most don't, but some do), set the Internet connection to a static IP and use something like 223.255.255.254 with a 255.255.255.252 subnet mask for the address and 223.255.255.253 for the gateway. (Trivia: that's the last subnet in the old Class C range, at the smallest size that still has two usable host addresses.)
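As a quick sanity check of that placeholder WAN setting, the arithmetic can be done with Python's standard `ipaddress` module. This is only a sketch that verifies the numbers quoted above; the variable names are made up for illustration.

```python
import ipaddress

# The placeholder WAN address and mask suggested above.
wan = ipaddress.ip_interface("223.255.255.254/255.255.255.252")
net = wan.network

# A /30 has four addresses: network, two usable hosts, broadcast.
usable_hosts = [str(h) for h in net.hosts()]
gateway = ipaddress.ip_address("223.255.255.253")

print("network:  ", net)
print("hosts:    ", usable_hosts)
print("broadcast:", net.broadcast_address)
print("gateway on same subnet:", gateway in net)
```

The two usable hosts are exactly the suggested address and gateway, so the router sees a self-consistent WAN configuration even though nothing is plugged into that port.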

The other possibility is a subnet mask mismatch. Every computer on the same network needs to have the same subnet mask configured (along with actually being in the same network, of course). A computer uses its subnet mask not only to work out the IP broadcast address but also to decide whether a broadcast packet it receives should be processed by the network stack: if the packet's addresses fall outside the network defined by the device's own address and mask, many devices will ignore it.
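A small sketch of what a mask mismatch looks like in practice, again using the standard `ipaddress` module. The addresses and masks below are invented for illustration (they are not taken from the question), and the helper names are hypothetical. How a given device reacts to an off-subnet peer varies by network stack, so treat this as a way to spot the mismatch, not as a prediction of behaviour.

```python
import ipaddress


def is_on_link(local_ip: str, local_mask: str, peer_ip: str) -> bool:
    """True if peer_ip falls inside the subnet implied by the local
    address and mask, i.e. the host would ARP for it directly."""
    local = ipaddress.ip_interface(f"{local_ip}/{local_mask}")
    return ipaddress.ip_address(peer_ip) in local.network


def broadcast_of(ip: str, mask: str) -> str:
    """Broadcast address the host derives from its own address and mask."""
    return str(ipaddress.ip_interface(f"{ip}/{mask}").network.broadcast_address)


# Hypothetical example: Alice has the right mask, Charlie a narrower one.
alice = ("192.168.1.200", "255.255.255.0")
charlie = ("192.168.1.100", "255.255.255.128")

print("Alice sees Charlie on-link:", is_on_link(*alice, charlie[0]))
print("Charlie sees Alice on-link:", is_on_link(*charlie, alice[0]))
print("Alice's broadcast:  ", broadcast_of(*alice))
print("Charlie's broadcast:", broadcast_of(*charlie))
```

Here Alice treats Charlie as a direct neighbour, but Charlie considers Alice to be on a different network and derives a different broadcast address, which is the kind of one-directional failure described above. Comparing the address and mask reported by each device is a quick way to rule this out.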

Hope this helps at least somewhat.
