Linux – How to configure a large mtu (linux)

ipv6 linux mtu

I have a gigabit ethernet connection from my laptop to my router, and a working ipv6 connection to the internet. ~~I can receive very large packets from sites on the internet, with sizes up to at least 10000 bytes (according to wireshark).~~ (edit: turns out to be linux's 'generic receive offload') However, when trying to send anything, my local computer fragments at just below 1500 bytes for ipv6. (On ipv4, I can send tcp packets to the internet of at least 1514 bytes, I can ping with packets up to the configured mtu of 6128 but they are blackholed.)

I'm on ubuntu 12.04. I have configured an mtu for my eth0 of 6128 (the maximum it accepts), both using ip link set dev eth0 mtu 6128 and in the NetworkManager applet gui, and restarted the connection. ip link show eth0 shows the 6128 mtu is indeed set. ip -6 route shows that none of the paths the kernel knows about have an mtu set. I can ping over ipv4 with packets up to 6128 bytes (though I don't get responses), but when I do ping6 myrouter -c3 -s1500 -Mdo I get error replies from my own computer saying that the packets are too large and the mtu is 1480. I have confirmed with Wireshark that nothing is put on the wire, and the replies are indeed generated by my own computer.

So, how do I get my computer to use the larger mtu?

Best Answer

What you are seeing are most likely not jumbo frames. Something like 99.9% of the Internet runs on an MTU of 1500 bytes or lower, after all. It is probably just your kernel or network card coalescing packets.

It does this using a feature usually called Generic Receive Offload (GRO) or Large Receive Offload (LRO). The way this works is that packets within a single flow get identified and merged, then fed to the TCP/IP stack as one large packet. This can save a significant number of CPU cycles, as it reduces the number of trips into the stack.
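To make the merging step concrete, here is a toy Python sketch. It is not the kernel's GRO code; the `Segment` type, the flow key layout, and the `coalesce` helper are all assumptions made up for illustration. It only shows why a capture can contain "packets" of roughly 10000 bytes even though nothing larger than the MTU crossed the wire.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flow: tuple      # hypothetical flow key: (src, sport, dst, dport)
    seq: int         # TCP sequence number of the first payload byte
    payload: bytes


def coalesce(segments):
    """Merge back-to-back, in-order segments of the same flow (toy model)."""
    merged = []
    last = {}  # flow key -> index in `merged` of that flow's latest packet
    for seg in segments:
        i = last.get(seg.flow)
        if i is not None:
            prev = merged[i]
            # Only merge when this segment continues exactly where the previous one ended.
            if prev.seq + len(prev.payload) == seg.seq:
                merged[i] = Segment(prev.flow, prev.seq, prev.payload + seg.payload)
                continue
        last[seg.flow] = len(merged)
        merged.append(Segment(seg.flow, seg.seq, seg.payload))
    return merged


# Seven wire-sized segments of one flow (addresses and ports are placeholders).
flow = ("2001:db8::1", 443, "2001:db8::2", 50000)
wire = [Segment(flow, 1 + i * 1428, b"x" * 1428) for i in range(7)]
stack = coalesce(wire)
print(len(stack), len(stack[0].payload))  # 1 9996
```

The stack (and a capture tool sitting above the driver) sees one 9996 byte packet, while the wire carried seven ordinary frames.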

Try this: ethtool -K $INTERFACE gro off

This turns off the feature, which makes Wireshark happier (though not your CPU).

You could still use higher MTUs locally, but it doesn't buy you very much anymore, precisely because of features like this and, of course, ever faster hardware. It can also be a management nightmare. There are lots of buggy drivers and hardware, and operating systems vary in how well they support setting the MTU through DHCP or RA. Since you want all devices in a given broadcast domain to run the same MTU, this often makes jumbo frames impractical.

Windows 7, Windows Vista

To show current MTU on Windows 7 or Windows Vista, from a command prompt:

C:\Users\Ian>netsh interface ipv6 show subinterfaces

       MTU  MediaSenseState   Bytes In  Bytes Out  Interface
----------  ---------------  ---------  ---------  -------------
      1280                1   24321220    6455865  Local Area Connection
4294967295                1          0    1060111  Loopback Pseudo-Interface 1
      1280                5          0          0  isatap.newland.com
      1280                5          0          0  6TO4 Adapter

And for IPv4 interfaces:

C:\Users\Ian>netsh interface ipv4 show subinterfaces

       MTU  MediaSenseState   Bytes In  Bytes Out  Interface
----------  ---------------  ---------  ---------  -------------
      1500                1  146289608   29200474  Local Area Connection
4294967295                1          0      54933  Loopback Pseudo-Interface 1

Note: In this example my Local Area Connection IPv6 interface has such a low MTU (1280) because I'm using a tunnel service to get IPv6 connectivity.

You can also change your MTU (Windows 7, Windows Vista). From an elevated command prompt:

>netsh interface ipv4 set subinterface "Local Area Connection" mtu=1492 store=persistent
Ok.

Tested with Windows 7 Service Pack 1

Windows XP

The netsh syntax for Windows XP is slightly different:

C:\Users\Ian>netsh interface ip show interface

Index:                                  1
User-friendly Name:                     Loopback
Type:                                   Loopback
MTU:                                    32767
Physical Address:                       

Index:                                  2
User-friendly Name:                     Local Area Connection
Type:                                   Etherenet
MTU:                                    1500
Physical Address:                       00-03-FF-D9-28-B7

Note: Windows XP requires that the Routing and Remote Access service be started before you can see details about an interface (including MTU):

C:\Users\Ian>net start remoteaccess

Windows XP does not provide a way to change the MTU setting from within netsh. For that you can:

follow the instructions in KB283165 - How to change the PPPoE MTU size in Windows XP
use Dr. TCP (Note: Windows 2000/XP only)

Tested with Windows XP Service Pack 3

+------------------------+
| 12 bytes control flags | \
| 4 byte from address    | |
| 4 byte to address      | |- IP and ICMP header: 28 bytes
|------------------------| |
| 8 byte ICMP header     | /
|------------------------|
| 1472 byte payload      |
|                        |
|                        |
|                        |
+------------------------+

That's where the "missing" 28 bytes is - it's the size of the headers required to send a ping packet.

When you send a ping packet, you can specify how much extra payload data you'd like to include. In this case, if you include all 1472 bytes:

>ping -l 1472 obsidian

Then the resulting ethernet packet will be full to the gills. Every last byte of the 1500 byte packet will be filled:

+------------------------+
| 12 bytes control flags | \
| 4 byte from address    | |
| 4 byte to address      | |- IP and ICMP header: 28 bytes
|------------------------| |
| 8 byte ICMP header     | /
|------------------------|
|........................|
|........................|
|. 1472 bytes of junk....|
|........................|
|........................|
|........................|
|........................|
+------------------------+
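The arithmetic in the diagram can be written down as a small Python sketch. The function name `max_ping_payload` is made up for illustration, and the header sizes assume an IPv4 header without options (20 bytes) or the fixed IPv6 header (40 bytes), plus the 8 byte ICMP echo header.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, fixed header only, no extension headers
ICMP_HEADER = 8   # bytes, echo request header (same size for ICMP and ICMPv6)


def max_ping_payload(mtu, ipv6=False):
    """Largest ping payload that still fits in one unfragmented packet."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - ICMP_HEADER


print(max_ping_payload(1500))             # 1472
print(max_ping_payload(1480, ipv6=True))  # 1432
```

The second line is the IPv6 case from the question: with a path MTU of 1480, a ping6 payload of 1500 bytes cannot fit, and anything above 1432 bytes would need fragmentation.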

If you try to send one more byte

>ping -l 1473 obsidian

the network will have to fragment that 1501 byte packet into multiple packets:

Packet 1 of 2
+------------------------+
| 12 bytes control flags | \
| 4 byte from address    | |
| 4 byte to address      | |- IP and ICMP header: 28 bytes
|------------------------| |
| 8 byte ICMP header     | /
|------------------------|
|........................|
|........................|
|..1472 bytes of payload.|
|........................|
|........................|
|........................|
|........................|
+------------------------+

Packet 2 of 2
+------------------------+
| 12 bytes control flags | \
| 4 byte from address    | |- IP header only: 20 bytes
| 4 byte to address      | /
|------------------------|
|.                       |
| 1 byte of payload      |
|                        |
|                        |
|                        |
|                        |
|                        |
+------------------------+

(The 8 byte ICMP header is only carried in the first fragment.)

This fragmentation will happen behind the scenes, ideally without you knowing.
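As a rough sketch of how the split is calculated, the Python below computes fragment sizes for a given MTU. The `fragment` helper is hypothetical and assumes a 20 byte IPv4 header without options; the one protocol detail it relies on is that every fragment except the last must carry a multiple of 8 bytes, because fragment offsets are counted in 8 byte units.

```python
def fragment(ip_payload_len, mtu, ip_header=20):
    """Return a list of (offset_bytes, data_len, more_fragments) tuples."""
    room = mtu - ip_header       # data bytes that fit behind one IP header
    max_data = room // 8 * 8     # non-final fragments carry a multiple of 8 bytes
    frags = []
    offset = 0
    while offset < ip_payload_len:
        remaining = ip_payload_len - offset
        # The final fragment may use all the room; earlier ones are 8-byte aligned.
        data_len = remaining if remaining <= room else max_data
        frags.append((offset, data_len, offset + data_len < ip_payload_len))
        offset += data_len
    return frags


icmp_header = 8
print(fragment(icmp_header + 1473, 1500))
# [(0, 1480, True), (1480, 1, False)]
```

For `ping -l 1473` the IP payload is 1481 bytes (ICMP header plus data), so the first fragment carries 1480 bytes and the second carries the single leftover byte.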

But you can be mean, and tell the network that the packet is not allowed to be fragmented:

>ping -l 1473 -f obsidian

The -f flag means do not fragment. Now when you try to send a packet that doesn't fit on the network you get the error:

>ping -l 1473 -f obsidian  

Packet needs to be fragmented but DF set.

The packet needs to be fragmented, but the Do not Fragment flag was set.

If anywhere along the path a packet needs to be fragmented but has the Do not Fragment flag set, the router that cannot forward it drops it and sends back an ICMP packet saying that fragmentation was needed, including the largest size it can pass on. Your machine gets this ICMP packet and is supposed to stop sending packets that are too big. Unfortunately, many firewalls block these "Path MTU discovery" ICMP packets, so your machine never realizes its packets are being dropped because they couldn't be fragmented.
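The feedback loop can be simulated in a few lines of Python. This is a toy model, not how any real stack is implemented: `send` and `discover_path_mtu` are made-up names, and the path is just a list of per-link MTUs.

```python
def send(size, link_mtus):
    """Return None if the packet crosses every link, else the MTU of the link that rejected it."""
    for mtu in link_mtus:
        if size > mtu:
            return mtu  # router drops the packet and reports the MTU it could have forwarded
    return None


def discover_path_mtu(link_mtus, first_hop_mtu, icmp_blocked=False):
    """Shrink the packet size until it fits, driven by the ICMP replies."""
    size = first_hop_mtu
    while True:
        reported = send(size, link_mtus)
        if reported is None:
            return size   # packet got through: this is the path MTU
        if icmp_blocked:
            return None   # black hole: the sender never learns why packets vanish
        size = reported   # retry with the size the router reported


path = [6128, 1500, 1480]  # hypothetical links: jumbo LAN, ISP link, tunnel
print(discover_path_mtu(path, 6128))                     # 1480
print(discover_path_mtu(path, 6128, icmp_blocked=True))  # None
```

With the ICMP replies blocked, the sender keeps using the large size and every big packet silently disappears, which is the black-hole behaviour described above.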

That's what causes the web server to not work. You can get the initial small (<1280 byte) responses, but larger packets can't get through. The web server's firewalls are misconfigured and block ICMP packets, so the web server never learns that you didn't get the packet.

In IPv6, routers are not allowed to fragment packets at all (only the sender may), so everyone is required to (correctly) allow ICMPv6 "Packet Too Big" MTU discovery packets.

IPv6 routing problem

To be completely honest, I don't know why or how this fixes it...

My new route table


andrew@route:~$ ip -6 route
2001:470:XXXX:XXXX::1 dev 6in4  metric 1024  mtu 1480 advmss 1420 hoplimit 0
2001:470:XXXX:XXXX::/64 dev eth0  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 0
fe80::/64 dev tap0  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 0
fe80::/64 dev eth0  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 0
fe80::/64 dev eth1  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 0
fe80::/64 via :: dev 6in4  proto kernel  metric 256  mtu 1480 advmss 1420 hoplimit 0
default via 2001:470:XXXX:XXXX::1 dev 6in4  metric 1024  mtu 1480 advmss 1420 hoplimit 0
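As a side note on reading this table, the mtu and advmss columns are related by simple arithmetic: the advertised TCP MSS is the MTU minus the 40 byte IPv6 header and the 20 byte TCP header (assuming no extension headers or TCP options). A small Python check, with `mtu_and_advmss` as a made-up helper name and two lines copied from the table above:

```python
import re

IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, without options


def mtu_and_advmss(route_line):
    """Pull the mtu and advmss values out of one `ip -6 route` line."""
    mtu = int(re.search(r"\bmtu (\d+)", route_line).group(1))
    advmss = int(re.search(r"\badvmss (\d+)", route_line).group(1))
    return mtu, advmss


routes = [
    "default via 2001:470:XXXX:XXXX::1 dev 6in4  metric 1024  mtu 1480 advmss 1420 hoplimit 0",
    "fe80::/64 dev eth0  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 0",
]
for line in routes:
    mtu, advmss = mtu_and_advmss(line)
    print(mtu, advmss, advmss == mtu - IPV6_HEADER - TCP_HEADER)
# 1480 1420 True
# 1500 1440 True
```

The 6in4 routes show 1480 because the tunnel's outer 20 byte IPv4 header is taken out of the usual 1500.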

The route that I ended up removing:


2001:470:XXXX:XXXX::/64 via :: dev 6in4  proto kernel  metric 256  mtu 1480 advmss 1420 hoplimit 0

After removing that route (randomly, of course), the "other computers" could ping out with no problem at all.

I've prevented my router from generating that route in the future by removing the address from the 6in4 interface in /etc/network/interfaces.

Anyway, thank you to those of you who took the time to read my post.
