SSH – How Does TCP-Keepalive Work in SSH

I am trying to write a shell script that uses an ssh connection for "heartbeats". I want both the client and the server side of that connection to terminate within a certain timeout after the connection drops.

What I found so far:

  • TCPKeepAlive yes/no for ssh and sshd
  • ClientAliveCountMax for sshd
  • ClientAliveInterval for sshd
  • ServerAliveCountMax for ssh
  • ServerAliveInterval for ssh

To change "ClientAliveCountMax" or "ClientAliveInterval" I would have to modify the sshd_config on each target machine (client-alive checks are disabled by default, since "ClientAliveInterval" defaults to 0).

So my question is – can I use "TCPKeepAlive" for my purposes, too (without changing anything else on the source/target machines)?

Target operating system is SLES11 SP2 – but I do not think that is relevant here.

Best Answer

You probably want to use the ServerAlive settings for this. They do not require any configuration on the server, and can be set on the command line if you wish.

ssh -o ServerAliveInterval=5 -o ServerAliveCountMax=1 "$HOST"

This sends an ssh keepalive message whenever no data has been received from the server for 5 seconds. If the time comes to send another keepalive and no response to the previous one has arrived, ssh terminates the connection.
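For the heartbeat script in the question, the command above could be wrapped roughly as follows. This is only a sketch under a few assumptions: the function name watch_connection and its arguments are made up for illustration, key-based login is assumed (BatchMode=yes stops ssh from prompting for a password), and -N is used to hold the connection open without running a remote command.

```shell
#!/bin/sh
# Hold an ssh connection open and return once it ends.
# watch_connection is a hypothetical helper name, not part of ssh.
# Usage: watch_connection HOST [INTERVAL_SECONDS]
watch_connection() {
    host=$1
    interval=${2:-5}   # seconds of silence before a keepalive is sent
    status=0

    # -N: open the connection without running a remote command.
    # ssh exits on its own once a keepalive goes unanswered.
    ssh -N \
        -o BatchMode=yes \
        -o ServerAliveInterval="$interval" \
        -o ServerAliveCountMax=1 \
        "$host" || status=$?

    # ssh uses exit status 255 for connection errors, which includes
    # a keepalive timeout.
    if [ "$status" -eq 255 ]; then
        echo "connection to $host lost" >&2
        return 1
    fi
    return "$status"
}
```

A script could then call `watch_connection "$HOST" 5` in a loop and react whenever it returns non-zero. Note that this only covers the client side; the server notices the dead connection through its own settings.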

The critical difference between ServerAliveInterval and TCPKeepAlive is the layer they operate at.

  • TCPKeepAlive operates on the TCP layer. It sends an empty TCP ACK packet, and its timing is controlled by the operating system's TCP settings rather than by ssh. Firewalls can be configured to ignore these packets, so if you go through a firewall that drops idle connections, they may not keep the connection alive.
  • ServerAliveInterval operates on the ssh layer. It actually sends data through the ssh connection, so the TCP packet carries encrypted data and a firewall can't tell whether it's a keepalive or a legitimate packet. These work better.
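A rough way to see why TCPKeepAlive alone is a poor fit for short heartbeat timeouts: on Linux, TCP keepalive timing comes from three kernel settings, and a dead peer is only detected after the idle time plus one interval per unanswered probe. The sketch below is Linux-specific and assumes the usual /proc/sys/net/ipv4 files are readable; both function names are hypothetical.

```shell
#!/bin/sh
# Worst-case seconds before the kernel gives up on an idle TCP connection:
# idle time before the first probe + interval * number of unanswered probes.
# Usage: tcp_keepalive_detect_seconds IDLE INTERVAL PROBES
tcp_keepalive_detect_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# Same calculation, using the values currently configured on this machine.
current_tcp_keepalive_detect_seconds() {
    dir=/proc/sys/net/ipv4
    tcp_keepalive_detect_seconds \
        "$(cat "$dir/tcp_keepalive_time")" \
        "$(cat "$dir/tcp_keepalive_intvl")" \
        "$(cat "$dir/tcp_keepalive_probes")"
}
```

With the stock Linux defaults (7200 seconds idle, 75 seconds between probes, 9 probes), `tcp_keepalive_detect_seconds 7200 75 9` prints 7875, a little over two hours. Bringing that down means changing system-wide kernel settings, whereas the ServerAlive options are set per connection.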