Linux – What are the essential iptables rules for IPv6 to work properly

Tags: ip6tables, iptables, ipv6, linux, netfilter

I had a problem where I lost connectivity to a server on its IPv6 address after some time. The cause turned out to be DHCPv6 client packets (UDP port 546) being dropped by the INPUT chain's default policy of DROP (I asked about that problem in a separate question). My rules were:

-A INPUT -i lo -j ACCEPT
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p ipv6-icmp -j ACCEPT
-A INPUT -s IP_OF_ANOTHER_HOST -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 443 -j ACCEPT
-P INPUT DROP

I thought these rules were enough, especially since RELATED and ESTABLISHED connections are allowed and my OUTPUT chain's default policy is ACCEPT, but I had to add this rule to accept DHCPv6 client packets:

-A INPUT -m conntrack --ctstate NEW -m udp -p udp --dport 546 -d fe80::/64 -j ACCEPT

I don't want to add rules that I might not need; I want to keep my rules as simple as possible.

So what are the essential rules that must be set for IPv6 to work properly? Should I also open DHCPv6 server port 547? And is it OK to accept all ICMPv6 packets?

Best Answer

The essential rules depend on the network: it might use SLAAC instead of DHCPv6, and there can be other complications from tunnels, ICMP handling, and so on.

-A INPUT -m conntrack --ctstate NEW -m udp -p udp --dport 546 -d fe80::/64 -j ACCEPT

is suitable for a DHCPv6 client. A DHCP client should not accept traffic to server port 547, as presumably it is not also a DHCP server. Replies come from port 547 on the DHCP server to port 546 on the client. Connection tracking does not match them, because the client sends to a multicast address (IPv6 has no broadcast) and the server replies from its own address, which is unrelated to the address the client sent to.

This is fairly safe. Root is required to listen on ports below 1024, so by default ordinary users on the client system should not be able to start a malicious service on port 546 (though maybe they could DoS network access?). fe80::/64 is link-local traffic, so remote attackers on some other subnet should not be able to route traffic to that port. If you have malicious users on your own subnet, you probably have more important problems to deal with, such as deploying network gear that blocks rogue DHCP servers.
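If you want the DHCPv6 rule to be a little narrower, the description above suggests also matching where the reply comes from. The following is only a sketch, assuming that the server or relay on your link answers from a link-local address and from source port 547, as described in this answer; neither match is in the original rule, and if leases stop renewing you should fall back to the original rule.

```
# Assumption: replies come from a link-local source address, source port 547.
# The -s and --sport matches are additions for illustration, not part of the original rule.
-A INPUT -s fe80::/10 -d fe80::/64 -p udp -m udp --sport 547 --dport 546 -m conntrack --ctstate NEW -j ACCEPT
```

Port 547 itself stays closed on the client; only packets addressed to the client port 546 are accepted.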

ICMPv6 can get very complicated depending on what you want to permit or deny, though for a simple IPv6 client it can probably be handled with the connection tracking defaults. See RFC 4443 (the ICMPv6 specification) and RFC 4890 (recommendations for filtering ICMPv6 messages in firewalls) for more details.
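For readers who would rather not accept every ICMPv6 packet, here is a hedged sketch of a narrower replacement for the `-A INPUT -p ipv6-icmp -j ACCEPT` rule in the question. The choice of types is an assumption for a plain host, loosely following the kind of filtering RFC 4890 discusses; it is not exhaustive (multicast listener discovery, for example, is left out) and should be tested on your own network before relying on it.

```
# Error messages: destination unreachable (1), packet too big (2, needed for
# path MTU discovery), time exceeded (3), parameter problem (4).
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 1 -j ACCEPT
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 2 -j ACCEPT
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 3 -j ACCEPT
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 4 -j ACCEPT
# Echo request (128) and echo reply (129), i.e. ping.
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 128 -j ACCEPT
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 129 -j ACCEPT
# Neighbor Discovery: router advertisement (134), neighbor solicitation (135),
# neighbor advertisement (136). Requiring hop limit 255 restricts them to
# packets that originated on the local link.
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 134 -m hl --hl-eq 255 -j ACCEPT
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 135 -m hl --hl-eq 255 -j ACCEPT
-A INPUT -p ipv6-icmp -m icmp6 --icmpv6-type 136 -m hl --hl-eq 255 -j ACCEPT
```

Neighbor Discovery is listed explicitly because it is link-local signalling rather than a reply within a tracked flow. If keeping the rules short matters more than filtering ICMPv6 finely, the single accept-all rule from the question remains the simpler option.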

Related Solutions

Iptables: Matching Outgoing Traffic with Conntrack and Owner – Troubleshooting Drops

To cut a long story short, that ACK was sent when the socket didn't belong to anybody. Instead of allowing packets that pertain to a socket that belongs to user x, allow packets that pertain to a connection that was initiated by a socket from user x.

The longer story.

To understand the issue, it helps to understand how wget and HTTP requests work in general.

wget http://cachefly.cachefly.net/10mb.test

wget establishes a TCP connection to cachefly.cachefly.net, and once established sends a request in the HTTP protocol that says: "Please send me the content of /10mb.test (GET /10mb.test HTTP/1.1) and by the way, could you please not close the connection after you're done (Connection: Keep-alive)". It does that so that, if the server replies with a redirect to a URL on the same IP address, it can reuse the connection.

Now the server can reply in one of two ways. Either: "Here comes the data you requested, beware it's 10MB large (Content-Length: 10485760), and yes OK, I'll leave the connection open". Or, if it doesn't know the size of the data: "Here's the data, sorry I can't leave the connection open, but I'll tell you when you can stop downloading by closing my end of the connection".

In the URL above, we're in the first case.

So, as soon as wget has obtained the headers for the response, it knows its job is done once it has downloaded 10MB of data.

Basically, what wget does is read the data until 10MB have been received and exit. But at that point, there's more to be done. What about the server? It's been told to leave the connection open.

Before exiting, wget closes (close system call) the file descriptor for the socket. Upon the close, the system finishes acknowledging the data sent by the server and sends a FIN to say: "I won't be sending any more data". At that point close returns and wget exits. There is no socket associated with the TCP connection anymore (at least not one owned by any user). However, it's not finished yet. Upon receiving that FIN, the HTTP server sees end-of-file when reading the next request from the client. In HTTP, that means "no more requests, I'll close my end". So it sends its FIN as well, to say, "I won't be sending anything either, that connection is going away".

Upon receiving that FIN, the client sends an ACK. But at that point wget is long gone, so that ACK is not from any user, which is why it is blocked by your firewall. Because the server doesn't receive the ACK, it's going to send the FIN over and over until it gives up, and you'll see more dropped ACKs. That also means that by dropping those ACKs, you're needlessly using resources of the server (which needs to maintain a socket in the LAST-ACK state) for quite some time.

The behavior would have been different if the client had not requested "Keep-alive" or the server had not replied with "Keep-alive".

As already mentioned, if you're using the connection tracker, what you want to do is let every packet in the ESTABLISHED and RELATED states through and only worry about NEW packets.

If you allow NEW packets from user x but not packets from user y, then other packets for established connections by user x will go through, and because there can't be established connections by user y (since we're blocking the NEW packets that would establish the connection), there will not be any packet for user y connections going through.
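As a rough illustration of that approach, the OUTPUT rules could look like the sketch below. The original rules from the question are not shown here, so this is an assumption about their general shape, and `userx` is a hypothetical account name.

```
# Loopback traffic is left alone.
-A OUTPUT -o lo -j ACCEPT
# Any packet belonging to an already tracked connection may leave, even when no
# socket (and therefore no owner) exists anymore, as with the final ACK.
-A OUTPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
# Only the packet that opens a connection is checked against the owner.
-A OUTPUT -m conntrack --ctstate NEW -m owner --uid-owner userx -j ACCEPT
# Everything else, including NEW packets from other users, is dropped.
-A OUTPUT -j DROP
```

The owner match is applied only to NEW packets, where a socket is guaranteed to exist, so the ownerless ACK at the end of the connection is covered by the ESTABLISHED rule instead.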

Implicit Inverses for iptables NAT Rules

Does iptables implicitly and automatically add the reverse/inverse rules for every NAT rule that is added explicitly?

Not exactly

Your first two quotes are correct; the third is the confused rambling of someone who doesn't understand how the system works.

iptables nat (unlike iptables filtering) works on connections. The first packet of the connection passes through the nat tables and is translated according to them. Later packets belonging to the same connection do not pass through the nat tables; they are simply translated according to the mapping established when the first packet was translated.

The iptables man page https://linux.die.net/man/8/iptables documents that the nat table is consulted for "the first packet of a connection", and the man page sections for the DNAT and SNAT targets say "(and all future packets in this connection will also be mangled)".
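To make this concrete, here is a minimal sketch of a single port-forwarding rule. The interface name `eth0` and the address `10.0.0.10` are hypothetical, and the forwarding and filter rules a real setup would also need are left out.

```
*nat
# Hypothetical forward: connections arriving on eth0, TCP port 8080, are sent to 10.0.0.10:80.
# Only the first packet of each connection is evaluated against this rule.
-A PREROUTING -i eth0 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 10.0.0.10:80
COMMIT
```

No matching reverse rule is written for the replies from 10.0.0.10. They belong to the same tracked connection, so their source address is rewritten back using the mapping recorded for the first packet. If the conntrack-tools package is installed, `conntrack -L` lists the tracked connections with both the original and the reply direction.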

Unfortunately I haven't seen any official documentation which goes into more depth than that. My go-to reference for iptables is the frozentux iptables tutorial but I don't think it's official.
