Linux Networking – Multiple Default Gateways for Outbound Connections

debian, gateway, iproute, linux-networking

I would like to have multiple NICs (eth0 and wlan0) in the same subnet, so that one serves as a backup for the applications on the host if the other fails. For this reason I have created an additional routing table. This is how /etc/network/interfaces looks:

iface eth0 inet static
address 192.168.178.2
netmask 255.255.255.0
dns-nameserver 8.8.8.8 8.8.4.4
    post-up ip route add 192.168.178.0/24 dev eth0 src 192.168.178.2
    post-up ip route add default via 192.168.178.1 dev eth0
    post-up ip rule add from 192.168.178.2/32
    post-up ip rule add to 192.168.178.2/32

iface wlan0 inet static
wpa-conf /etc/wpa_supplicant.conf
wireless-essid xyz
address 192.168.178.3
netmask 255.255.255.0
dns-nameserver 8.8.8.8 8.8.4.4
    post-up ip route add 192.168.178.0/24 dev wlan0 src 192.168.178.3 table rt2
    post-up ip route add default via 192.168.178.1 dev wlan0 table rt2
    post-up ip rule add from 192.168.178.3/32 table rt2
    post-up ip rule add to 192.168.178.3/32 table rt2

That works for connecting to the host: I can still SSH into it if one of the interfaces fails. However, the applications on the host cannot initiate a connection to the outside world if eth0 is down. That is my problem.

I have researched that topic and found the following interesting information:

When a program initiates an outbound connection it is normal for it to
use the wildcard source address (0.0.0.0), indicating no preference as
to which interface is used provided that the relevant destination
address is reachable. This is not replaced by a specific source
address until after the routing decision has been made. Traffic
associated with such connections will not therefore match either of
the above policy rules, and will not be directed to either of the
newly-added routing tables. Assuming an otherwise normal
configuration, it will instead fall through to the main routing table.
http://www.microhowto.info/howto/ensure_symmetric_routing_on_a_server_with_multiple_default_gateways.html

What I want is for the main route table to have more than one default gateway (one on eth0 and one on wlan0) and to go to the default gateway via eth0 by default and via wlan0 if eth0 is down.

Is that possible? What do I need to do to achieve such a functionality?

Best Answer

Solved it myself. There seems to be very little information about the networking stuff that you can do with Linux, so I have decided to document and explain my solution in detail. This is my final setup:

  • 3 NICs: eth0 (wire), wlan0 (built-in wifi, weak), wlan1 (usb wifi adapter, stronger signal than wlan0)
  • All of them on a single subnet, each of them with their own IP address.
  • eth0 should be used for both incoming and outgoing traffic by default.
  • If eth0 fails then wlan1 should be used.
  • If wlan1 fails then wlan0 should be used.

First step: Create a new route table for every interface in /etc/iproute2/rt_tables. Let's call them rt1, rt2 and rt3:

#
# reserved values
#
255 local
254 main
253 default
0 unspec
#
# local
#
#1  inr.ruhep
1 rt1
2 rt2
3 rt3
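
If you would rather script this step than edit the file by hand, here is a minimal sketch. The helper name add_rt_table is my own invention, not part of iproute2, and it assumes the "ID NAME" line format shown above. It appends an entry only if no uncommented line already defines that table name, so it is safe to run more than once.

```shell
# add_rt_table FILE ID NAME   (hypothetical helper, not part of iproute2)
# Append "ID NAME" to FILE unless an uncommented line already defines NAME.
add_rt_table() {
    # Inside the awk program, $1 and $2 are fields of the current line:
    # skip comment lines and look for NAME in the second column.
    if ! awk -v n="$3" '$1 !~ /^#/ && $2 == n { found = 1 } END { exit !found }' "$1"; then
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}
```

Run as root, add_rt_table /etc/iproute2/rt_tables 1 rt1 (and likewise for rt2 and rt3) should leave the file as shown above.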

Second step: Network configuration in /etc/network/interfaces. This is the main part and I'll try to explain as much as I can:

auto eth0 wlan0
allow-hotplug wlan1

iface lo inet loopback

iface eth0 inet static
address 192.168.178.99
netmask 255.255.255.0
dns-nameserver 8.8.8.8 8.8.4.4
    post-up ip route add 192.168.178.0/24 dev eth0 src 192.168.178.99 table rt1
    post-up ip route add default via 192.168.178.1 dev eth0 table rt1
    post-up ip rule add from 192.168.178.99/32 table rt1
    post-up ip rule add to 192.168.178.99/32 table rt1
    post-up ip route add default via 192.168.178.1 metric 100 dev eth0
    post-down ip rule del from 0/0 to 0/0 table rt1
    post-down ip rule del from 0/0 to 0/0 table rt1

iface wlan0 inet static
wpa-conf /etc/wpa_supplicant.conf
wireless-essid xyz
address 192.168.178.97
netmask 255.255.255.0
dns-nameserver 8.8.8.8 8.8.4.4
    post-up ip route add 192.168.178.0/24 dev wlan0 src 192.168.178.97 table rt2
    post-up ip route add default via 192.168.178.1 dev wlan0 table rt2
    post-up ip rule add from 192.168.178.97/32 table rt2
    post-up ip rule add to 192.168.178.97/32 table rt2
    post-up ip route add default via 192.168.178.1 metric 102 dev wlan0
    post-down ip rule del from 0/0 to 0/0 table rt2
    post-down ip rule del from 0/0 to 0/0 table rt2

iface wlan1 inet static
wpa-conf /etc/wpa_supplicant.conf
wireless-essid xyz
address 192.168.178.98
netmask 255.255.255.0
dns-nameserver 8.8.8.8 8.8.4.4
    post-up ip route add 192.168.178.0/24 dev wlan1 src 192.168.178.98 table rt3
    post-up ip route add default via 192.168.178.1 dev wlan1 table rt3
    post-up ip rule add from 192.168.178.98/32 table rt3
    post-up ip rule add to 192.168.178.98/32 table rt3
    post-up ip route add default via 192.168.178.1 metric 101 dev wlan1
    post-down ip rule del from 0/0 to 0/0 table rt3
    post-down ip rule del from 0/0 to 0/0 table rt3
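
The three stanzas differ only in interface name, address, table and metric. To make that pattern explicit, here is a small sketch that prints (but does not run) the ip commands behind the post-up lines of one interface. The function name policy_cmds and its argument order are assumptions of mine; the subnet and gateway default to the values used in this setup.

```shell
# policy_cmds IFACE ADDR TABLE METRIC [SUBNET] [GATEWAY]   (hypothetical helper)
# Print the ip commands that correspond to one interface's post-up lines.
policy_cmds() {
    subnet=${5:-192.168.178.0/24}   # defaults taken from the setup above
    gw=${6:-192.168.178.1}
    cat <<EOF
ip route add $subnet dev $1 src $2 table $3
ip route add default via $gw dev $1 table $3
ip rule add from $2/32 table $3
ip rule add to $2/32 table $3
ip route add default via $gw metric $4 dev $1
EOF
}
```

For example, policy_cmds wlan1 192.168.178.98 rt3 101 prints the five commands for wlan1. This is handy for trying the rules out by hand before committing them to /etc/network/interfaces.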

If you type ip rule show you should see the following:

0:  from all lookup local 
32756:  from all to 192.168.178.98 lookup rt3 
32757:  from 192.168.178.98 lookup rt3 
32758:  from all to 192.168.178.99 lookup rt1 
32759:  from 192.168.178.99 lookup rt1 
32762:  from all to 192.168.178.97 lookup rt2 
32763:  from 192.168.178.97 lookup rt2 
32766:  from all lookup main 
32767:  from all lookup default 

This tells us that traffic to or from the IP address 192.168.178.99 will use the rt1 route table, and likewise for the other two addresses and their tables. So far so good. But locally generated traffic (for example, when you ping or ssh from the machine to somewhere else) needs special treatment (see the big quote in the question).
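
A quick way to check that both rules of an interface are in place is to count the lines that look up its table. This is a sketch only; rules_for_table is a name I made up, and it assumes the "lookup TABLE" wording that ip rule show prints above.

```shell
# rules_for_table TABLE   (hypothetical helper)
# Count the policy rules on stdin (in `ip rule show` format) that look up TABLE.
rules_for_table() {
    awk -v t="$1" '
        { for (i = 1; i < NF; i++) if ($i == "lookup" && $(i + 1) == t) n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    '
}
```

With the setup above, ip rule show | rules_for_table rt1 should print 2 while eth0 is up, and 0 after ifdown eth0 has run its post-down lines.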

The first four post-up lines of each stanza in /etc/network/interfaces are straightforward and explanations can be found on the internet. The fifth and last post-up line is the one that makes the magic happen:

post-up ip route add default via 192.168.178.1 metric 100 dev eth0

Note how we haven't specified a route table for this post-up line. If you don't specify a route table, the route is saved in the main route table that we saw in ip rule show. This post-up line puts a default route in the "main" route table, which is used for locally generated traffic that is not a response to incoming traffic. (For example, an MTA on your server trying to send an e-mail.)
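
To watch the two kinds of lookup side by side you can ask the kernel for its routing decision with ip route get, once without a source address and once with the source pinned to one of the interface addresses. The wrapper name show_route_choice is my own; the two ip route get invocations are the only real commands in it.

```shell
# show_route_choice DEST ADDR   (hypothetical wrapper around `ip route get`)
# Print the routing decision for DEST twice: first with no source address
# chosen yet, then with the source address pinned to ADDR.
show_route_choice() {
    echo "unbound source:"
    ip route get "$1"
    echo "source $2:"
    ip route get "$1" from "$2"
}
```

For example, show_route_choice 8.8.8.8 192.168.178.97 should, assuming the setup above, show the first lookup leaving via the lowest-metric default route of the main table, and the second one leaving via wlan0 because of the rt2 rule.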

All three interfaces put a default route in the main route table, each with a different metric. Let's take a look at the main route table with ip route show:

default via 192.168.178.1 dev eth0  metric 100 
default via 192.168.178.1 dev wlan1  metric 101 
default via 192.168.178.1 dev wlan0  metric 102 
192.168.178.0/24 dev wlan0  proto kernel  scope link  src 192.168.178.97 
192.168.178.0/24 dev eth0  proto kernel  scope link  src 192.168.178.99 
192.168.178.0/24 dev wlan1  proto kernel  scope link  src 192.168.178.98

We can see that the main route table has three default routes that differ only in their metrics. Lower metric numbers indicate a higher priority, so the order is eth0, then wlan1, then wlan0. Since eth0 has the lowest metric, its default route is the one used for as long as eth0 is up. If eth0 goes down, outgoing traffic will switch to wlan1.
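
The selection rule can be mimicked in a few lines of shell. This is only a sketch of the "lowest metric wins" logic applied to the text output of ip route show, not how the kernel implements it, and best_default_dev is a name I made up. A default route without a metric is treated as metric 0.

```shell
# best_default_dev   (hypothetical helper)
# Read `ip route show` output on stdin and print the device of the
# default route with the lowest metric.
best_default_dev() {
    awk '
        $1 == "default" {
            dev = ""; metric = 0          # no metric keyword counts as 0
            for (i = 2; i < NF; i++) {
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "metric") metric = $(i + 1) + 0
            }
            if (best == "" || metric < bestmetric) { best = dev; bestmetric = metric }
        }
        END { if (best != "") print best }
    '
}
```

With the table above, ip route show | best_default_dev prints eth0. Filtering out the eth0 line first (for example with grep -v 'dev eth0') makes it print wlan1, which is the failover order we want.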

With this setup we can type ping 8.8.8.8 in one terminal and ifdown eth0 in another. ping should keep working: ifdown eth0 removes the default route related to eth0, so outgoing traffic switches to wlan1.

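The post-down lines remove the two rules (from and to) that point at the interface's route table from the routing policy database (ip rule show) when the interface goes down, in order to keep everything tidy. Each ip rule del call removes one rule, which is why the line appears twice.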

The problem that is left is that when you pull the plug from eth0 the default route for eth0 is still there and outgoing traffic fails. We need something to monitor our interfaces and to execute ifdown eth0 if there's a problem with the interface (i.e. NIC failure or someone pulling the plug).
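
To illustrate the condition that has to be detected: the kernel exposes the link state of each interface in /sys/class/net/IFACE/carrier. The sketch below reads that file; link_is_up is a hypothetical helper, and the optional second argument exists only so the function can be tried against a scratch directory. It is not a replacement for a proper monitor, because it only reports the state and takes no action.

```shell
# link_is_up IFACE [SYSFS_ROOT]   (hypothetical helper)
# Succeed if the kernel reports a carrier on IFACE.
# SYSFS_ROOT defaults to /sys/class/net and is overridable for dry runs.
link_is_up() {
    carrier_file="${2:-/sys/class/net}/$1/carrier"
    # The file holds 1 with a carrier and 0 without; reading it fails
    # while the interface is administratively down, which counts as "not up".
    [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]
}
```

For example, link_is_up eth0 || echo "eth0 has no link" reports a pulled cable. Reacting to that state in both directions (ifdown when the link goes away, ifup when it returns) is the part better left to a dedicated daemon.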

Last step: enter ifplugd. That's a daemon that watches interfaces and executes ifup/ifdown if you pull the plug or if there's a problem with the wifi connection. This is its configuration in /etc/default/ifplugd:

INTERFACES="eth0 wlan0 wlan1"
HOTPLUG_INTERFACES=""
ARGS="-q -f -u0 -d10 -w -I"
SUSPEND_ACTION="stop"

You can now pull the plug on eth0 and outgoing traffic will switch to wlan1; if you put the plug back in, outgoing traffic will switch back to eth0. Your server will stay online as long as any of the three interfaces works. For connecting to your server you can use the IP address of eth0 and, if that fails, the IP address of wlan1 or wlan0.
