Default TCP KeepAlive settings


TCP keepalive (socket option SO_KEEPALIVE) is governed by three settings: the idle time after which the mechanism triggers, the probing interval, and the number of failed probes after which the connection is declared broken.

Their defaults are:

tcp_keepalive_time = 7200
tcp_keepalive_intvl = 75
tcp_keepalive_probes = 9

Sending probes every 75 seconds (1¼ minutes) sounds reasonable, and so does declaring failure after 9 failed probes, but what is the idea behind the initial time being 2 hours?
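For scale: with these defaults, a dead peer on an idle connection is only noticed after 7200 + 9 * 75 = 7875 seconds, roughly 2 hours 11 minutes. A minimal sketch of that arithmetic, assuming the usual reading of the three settings (idle time first, then one probe per interval until the probe count is exhausted); the function name is made up for illustration and is not a kernel API:

```c
/* Worst-case seconds from the last traffic on an idle connection until the
 * kernel gives up on it: the idle time, followed by `probes` unanswered
 * probes spaced `intvl` seconds apart.
 * Hypothetical helper for illustration only. */
long keepalive_detection_seconds(long idle, long intvl, long probes)
{
    return idle + intvl * probes;
}
```

Calling keepalive_detection_seconds(7200, 75, 9) gives 7875, so the 2 hour idle time dominates the total by a wide margin.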

Even tcp(7) says

Note that underlying connection tracking mechanisms and application timeouts may be much shorter.

The main point of enabling keepalive is to prevent stateful network elements from dropping their state information, but such elements tend to drop idle connections within a couple of minutes. With some rate-limited servers, curl with a short --keepalive-time seems to significantly improve the reliability of downloads.
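For context, an application does not have to live with the system-wide values; it can override them per socket. A sketch of how that might look on Linux, assuming the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options described in tcp(7); the function name enable_keepalive and the example values are illustrative, and other operating systems name these options differently or lack them:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn keepalive on for one socket and override the system-wide timers.
 * TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific.
 * Returns 0 on success, -1 on the first failing setsockopt() call
 * (errno is left as set by that call). */
int enable_keepalive(int fd, int idle_s, int intvl_s, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof intvl_s) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) == -1)
        return -1;
    return 0;
}
```

For example, enable_keepalive(fd, 60, 10, 3) would start probing after one idle minute and give up after three unanswered probes sent 10 seconds apart, which is short enough for most connection-tracking timeouts.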

So why is the default so long?

Best Answer

TCP keep-alive was defined at a time when even the concept of a firewall, let alone a stateful firewall or NAT, was probably not widespread. From RFC 1122 (October 1989):

4.2.3.6 TCP Keep-Alives

Implementors MAY include "keep-alives" in their TCP
implementations, although this practice is not universally
accepted. If keep-alives are included, the application MUST
be able to turn them on or off for each TCP connection, and
they MUST default to off.

Keep-alive packets MUST only be sent when no data or
acknowledgement packets have been received for the
connection within an interval. This interval MUST be
configurable and MUST default to no less than two hours.

[...]

The main concern at the time wasn't the loss of state information in network elements:

DISCUSSION:
A "keep-alive" mechanism periodically probes the other
end of a connection when the connection is otherwise
idle, even when there is no data to be sent. The TCP
specification does not include a keep-alive mechanism
because it could: (1) cause perfectly good connections
to break during transient Internet failures; (2)
consume unnecessary bandwidth ("if no one is using the
connection, who cares if it is still good?"); and (3)
cost money for an Internet path that charges for
packets.

[...]

A TCP keep-alive mechanism should only be invoked in
server applications that might otherwise hang
indefinitely and consume resources unnecessarily if a
client crashes or aborts a connection during a network
failure.

I skimmed the updating RFCs, but couldn't find any mention of keep-alives.

Related Solutions

SSH – How Does TCP-Keepalive Work in SSH

You probably want to use the ServerAlive settings for this. They do not require any configuration on the server, and can be set on the command line if you wish.

ssh -o ServerAliveInterval=5 -o ServerAliveCountMax=1 $HOST

This will send an ssh keepalive message every 5 seconds. If it comes time to send another keepalive and no response to the last one has been received, the connection is terminated.

The critical difference between ServerAliveInterval and TCPKeepAlive is the layer they operate at.

TCPKeepAlive operates on the TCP layer. It sends an empty TCP ACK packet. Firewalls can be configured to ignore these packets, so if you go through a firewall that drops idle connections, these may not keep the connection alive.
ServerAliveInterval operates on the ssh layer. It actually sends data through ssh, so the TCP packet has encrypted data in it and a firewall can't tell whether it's a keepalive or a legitimate packet, so these work better.

Networking – Is it Possible to Connect to TCP Port 0?

Just to make sure we're on the same page (your question is ambiguous this way), asking to bind TCP on port 0 indicates a request to dynamically generate an unused port number. In other words, the port number you're actually listening on after that request is not zero. There's a comment about this in [linux kernel source]/net/ipv4/inet_connection_sock.c on inet_csk_get_port():

/* Obtain a reference to a local port for the given sock,
 * if snum is zero it means select any available local port.
 */

This is a standard Unix convention. There could be systems that actually allow use of port 0, but that would be considered bad practice. This behaviour is not officially specified by POSIX, IANA, or the TCP protocol, however.¹
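The convention is easy to observe from user space. A minimal sketch, assuming a POSIX sockets environment; bind_any_port is a made-up helper name. It binds to port 0 and then asks getsockname() which port the kernel actually assigned:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a TCP socket to port 0 on all interfaces and return the port the
 * kernel actually picked (host byte order), or -1 on error.
 * Hypothetical helper for illustration only. */
int bind_any_port(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                 /* "pick any free port" */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    socklen_t len = sizeof addr;
    int port = -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(fd, (struct sockaddr *)&addr, &len) == 0)
        port = ntohs(addr.sin_port);          /* the port really in use */

    close(fd);
    return port;
}
```

On systems that follow the convention, the returned port is a nonzero number from the ephemeral range rather than 0.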

That's why you cannot sensibly make a TCP connection to port zero. Presumably nc is aware of this and informs you that you're making a nonsensical request. If you try this in native code:

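/* Needs <arpa/inet.h>, <errno.h>, <stdio.h>, <string.h>, <sys/socket.h>. */
int fd = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in addr = {0};          /* zero the unused padding bytes */
addr.sin_family = AF_INET;
addr.sin_port = htons(0);               /* port 0, in network byte order */
inet_aton("127.0.0.1", &addr.sin_addr);
if (connect(fd, (const struct sockaddr*)&addr, sizeof(addr)) == -1) {
    fprintf(stderr, "%s\n", strerror(errno));  /* report why connect() failed */
}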

You get the same error you would trying to connect to any other unavailable port: ECONNREFUSED, "Connection refused". So in reply to:

Where in the system is this handled? In the TCP stack of the OS kernel?

Probably not; it doesn't require special handling. I.e., if you can find a system that allows binding and listening on port 0, you could presumably connect to it.

¹ But IANA does refer to it as "Reserved". Meaning, this port should not be used online. That makes it okay with regard to the dynamic assignment convention (since it won't actually be used). Stipulating that specifically as a purpose would probably be beyond the scope of IANA; in essence operating systems are free to do what they want with it, including nothing.
