Default TCP KeepAlive settings

Tags: defaults, networking, tcp

TCP keepalive (socket option SO_KEEPALIVE) is governed by three options: the idle time after which the mechanism triggers, the probing interval, and the number of failed probes after which the connection is declared broken.

Their defaults are:

  • tcp_keepalive_time = 7200 (seconds)
  • tcp_keepalive_intvl = 75 (seconds)
  • tcp_keepalive_probes = 9 (probes)

Sending probes every 1¼ minutes sounds reasonable, and declaring failure after 9 failed probes does as well, but what is the idea behind the initial time being 2 hours?
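To put numbers on that, here is a rough Python sketch of how long a dead peer can go unnoticed with these defaults. It assumes the simple model of idle time plus one interval per failed probe; the function name is my own, not part of any API.

```python
def keepalive_detection_seconds(idle_time, interval, probes):
    """Approximate seconds from the last traffic until the connection is declared dead."""
    # Idle period before the first probe, then one interval per unanswered probe.
    return idle_time + interval * probes

total = keepalive_detection_seconds(7200, 75, 9)
hours, rem = divmod(total, 3600)
minutes, seconds = divmod(rem, 60)
print(total)                                # 7875
print(f"{hours}h {minutes}m {seconds}s")    # 2h 11m 15s
```

So almost all of the detection time comes from the initial 2 hours, not from the probing itself.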

Even tcp(7) says

Note that underlying connection tracking mechanisms and application timeouts may be much shorter.

The main point of enabling keepalive is to prevent any stateful network elements from dropping the state information, but such elements tend to drop idle connections after a couple of minutes. With some rate-limited servers, curl with a short --keepalive-time seems to significantly improve the reliability of downloads.
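For reference, this is roughly what a short per-connection keepalive looks like from an application. It is a minimal Python sketch: the helper name and the 60/10/5 values are illustrative assumptions, and the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT option names are the Linux ones, so the code only sets those that the platform exposes.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on keepalive for one socket and, where supported, shorten its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options are platform-specific, so set each one
    # only if this Python build exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    enable_keepalive(s)
    # Non-zero means keepalive is now enabled on this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
```

These per-socket values override the system-wide defaults only for the socket they are set on.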

So why is the default so long?

Best Answer

TCP keep-alive was defined at a time when even the concept of a firewall, let alone a stateful firewall or NAT, was probably not widespread. From RFC 1122 (October 1989):

4.2.3.6 TCP Keep-Alives

Implementors MAY include "keep-alives" in their TCP
implementations, although this practice is not universally
accepted. If keep-alives are included, the application MUST
be able to turn them on or off for each TCP connection, and
they MUST default to off.

Keep-alive packets MUST only be sent when no data or
acknowledgement packets have been received for the
connection within an interval. This interval MUST be
configurable and MUST default to no less than two hours.

[...]
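The "MUST default to off" requirement is still easy to observe today: a freshly created socket reports SO_KEEPALIVE as disabled until the application opts in. A minimal Python sketch (the helper name is my own):

```python
import socket

def keepalive_enabled(sock):
    """Return True if SO_KEEPALIVE is currently set on this socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print(keepalive_enabled(s))   # off unless the application asks for it
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    print(keepalive_enabled(s))   # on for this connection only
```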

The main concern at the time wasn't lost state information:

DISCUSSION:
A "keep-alive" mechanism periodically probes the other
end of a connection when the connection is otherwise
idle, even when there is no data to be sent. The TCP
specification does not include a keep-alive mechanism
because it could: (1) cause perfectly good connections
to break during transient Internet failures; (2)
consume unnecessary bandwidth ("if no one is using the
connection, who cares if it is still good?"); and (3)
cost money for an Internet path that charges for
packets.

[...]

A TCP keep-alive mechanism should only be invoked in
server applications that might otherwise hang
indefinitely and consume resources unnecessarily if a
client crashes or aborts a connection during a network
failure.

I skimmed the updating RFCs, but couldn't find any mention of keep-alives.