I believe the idea behind the port being unavailable to a program is to allow any TCP segments still in transit to arrive and be discarded by the kernel. That is, an application can call close(2) on a socket, but routing delays, lost control packets, or what have you can allow the other side of a TCP connection to keep sending data for a while. The application has indicated it no longer wants to deal with TCP data segments, so the kernel should just discard them as they come in.
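If a program wants to opt out of that wait for its own listening port, the usual trick is to set SO_REUSEADDR before calling bind(2). A minimal sketch, with a make_listener() helper name of my own invention:

#include <string.h>      /* memset() */
#include <sys/types.h>
#include <sys/socket.h>  /* socket(), setsockopt(), bind() */
#include <netinet/in.h>  /* struct sockaddr_in, htons(), htonl() */

/* Sketch: create a socket that can rebind a local port
 * still stuck in TIME-WAIT.  Returns the fd, or -1 on error. */
int
make_listener(unsigned short port)
{
    int one = 1;
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (0 > fd)
        return -1;

    /* Must be set before bind(2) to have any effect. */
    if (0 > setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)))
        return -1;

    memset(&addr, '\0', sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (0 > bind(fd, (struct sockaddr *)&addr, sizeof(addr)))
        return -1;
    return fd;
}

Note this only relaxes the bind(2) restriction; the old connection's TIME-WAIT entry still runs its course in the kernel. The test program below deliberately leaves the option unset, which is why it can measure the delay.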
I hacked out a little program in C that you can compile and use to see how long the timeout is:
#include <stdio.h>      /* fprintf() */
#include <string.h>     /* strerror(), memset() */
#include <errno.h>      /* errno */
#include <stdlib.h>     /* atoi(), exit() */
#include <sys/time.h>   /* gettimeofday(), struct timeval */
#include <unistd.h>     /* getopt(), close(), sleep() */
#include <sys/types.h>  /* socket() */
#include <sys/socket.h> /* socket-related stuff */
#include <netinet/in.h> /* struct sockaddr_in, htons(), htonl() */
#include <arpa/inet.h>  /* inet_ntoa() */

float elapsed_time(struct timeval before, struct timeval after);

int
main(int ac, char **av)
{
    int opt;
    int listen_fd = -1;
    unsigned short port = 0;
    struct sockaddr_in serv_addr;
    struct timeval before_bind;
    struct timeval after_bind;

    while (-1 != (opt = getopt(ac, av, "p:"))) {
        switch (opt) {
        case 'p':
            port = (unsigned short)atoi(optarg);
            break;
        }
    }

    if (0 == port) {
        fprintf(stderr, "Need a port to listen on\n");
        return 2;
    }

    if (0 > (listen_fd = socket(AF_INET, SOCK_STREAM, 0))) {
        fprintf(stderr, "Opening socket: %s\n", strerror(errno));
        return 1;
    }

    memset(&serv_addr, '\0', sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    serv_addr.sin_port = htons(port);

    /* Retry bind(2) once per second until the kernel allows it,
     * timing how long that takes. */
    gettimeofday(&before_bind, NULL);
    while (0 > bind(listen_fd, (struct sockaddr *)&serv_addr, sizeof(serv_addr))) {
        fprintf(stderr, "binding socket to port %d: %s\n",
            ntohs(serv_addr.sin_port),
            strerror(errno));
        sleep(1);
    }
    gettimeofday(&after_bind, NULL);

    printf("bind took %.5f seconds\n", elapsed_time(before_bind, after_bind));
    printf("# Listening on port %d\n", ntohs(serv_addr.sin_port));

    if (0 > listen(listen_fd, 100)) {
        fprintf(stderr, "listen() on fd %d: %s\n",
            listen_fd,
            strerror(errno));
        return 1;
    }

    {
        struct sockaddr_in cli_addr;
        struct timeval before;
        int newfd;
        socklen_t clilen;

        clilen = sizeof(cli_addr);
        if (0 > (newfd = accept(listen_fd, (struct sockaddr *)&cli_addr, &clilen))) {
            fprintf(stderr, "accept() on fd %d: %s\n", listen_fd, strerror(errno));
            exit(2);
        }
        gettimeofday(&before, NULL);
        printf("At %ld.%06ld\tconnected to: %s\n",
            before.tv_sec, before.tv_usec,
            inet_ntoa(cli_addr.sin_addr)
        );
        fflush(stdout);
        /* close(2) returns -1 with errno set to EINTR when interrupted;
         * it does not return EINTR itself. */
        while (0 > close(newfd) && EINTR == errno)
            ;
    }

    if (0 > close(listen_fd))
        fprintf(stderr, "Closing socket: %s\n", strerror(errno));

    return 0;
}

float
elapsed_time(struct timeval before, struct timeval after)
{
    float r = 0.0;

    /* Borrow a second if the microseconds would go negative. */
    if (before.tv_usec > after.tv_usec) {
        after.tv_usec += 1000000;
        --after.tv_sec;
    }
    r = (float)(after.tv_sec - before.tv_sec)
        + (1.0E-6)*(float)(after.tv_usec - before.tv_usec);
    return r;
}
I tried this program on 3 different machines, and the delay varies between 55 and 59 seconds: that's how long the kernel refuses to let a non-root user rebind the port. I compiled the above code to an executable named "opener", and ran it like this:
./opener -p 7896; ./opener -p 7896
I opened another window and did this:
telnet otherhost 7896
That causes the first instance of "opener" to accept a connection and then close it. The second instance of "opener" then tries to bind(2) to TCP port 7896 once a second, and it reports a delay of 55 to 59 seconds before the bind succeeds.
Googling around, I find that people recommend doing this:
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
to reduce that interval. It didn't work for me: of the four Linux machines I had access to, two had the value 30 and two had 60, and I also set it as low as 10, with no difference to the "opener" program. (Apparently tcp_fin_timeout governs the FIN-WAIT-2 state rather than TIME-WAIT; the TIME-WAIT interval is hardcoded in the Linux kernel as TCP_TIMEWAIT_LEN, 60 seconds, which would explain why the tunable had no effect.)
Doing this:
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
did change things. The second "opener" only took about 3 seconds to get its new socket.
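For what it's worth, the same knob can be flipped from C by writing the proc file directly (needs root; the set_tw_recycle() helper name is just mine):

#include <stdio.h>  /* fopen(), fprintf(), fclose() */

/* Sketch: write a value into /proc/sys/net/ipv4/tcp_tw_recycle,
 * the same thing the echo command above does. */
int
set_tw_recycle(int value)
{
    FILE *f = fopen("/proc/sys/net/ipv4/tcp_tw_recycle", "w");

    if (NULL == f)
        return -1;
    fprintf(f, "%d\n", value);
    return fclose(f);
}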
t_open() and its associated /dev/tcp and such are part of the TLI/XTI interface, which lost the battle for TCP/IP APIs to BSD sockets.
On Linux, there is a /dev/tcp of sorts. It isn't a real file or kernel device; it's something specially provided by Bash, and it exists only for redirections. This means that even if one were to create an in-kernel /dev/tcp facility, it would be masked in interactive use 99%[*] of the time by the shell.
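You can see the Bash version in action with nothing but redirections; for instance (host and port are just examples):

exec 3<>/dev/tcp/otherhost/7896
echo hello >&3
cat <&3

Bash intercepts the /dev/tcp path itself, so there is no actual file for ls(1) to find.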
The best solution really is to switch to BSD sockets. Sorry.
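For what it's worth, the BSD-sockets equivalent of a t_open()/t_connect()-style client is only a handful of calls. A minimal sketch (the address and port are examples, error handling abbreviated):

#include <string.h>      /* memset() */
#include <unistd.h>      /* write(), close() */
#include <sys/types.h>
#include <sys/socket.h>  /* socket(), connect() */
#include <netinet/in.h>  /* struct sockaddr_in, htons() */
#include <arpa/inet.h>   /* inet_addr() */

int
main(void)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (0 > fd)
        return 1;

    memset(&addr, '\0', sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(7896);                   /* example port */
    addr.sin_addr.s_addr = inet_addr("127.0.0.1"); /* example host */

    if (0 > connect(fd, (struct sockaddr *)&addr, sizeof(addr)))
        return 1;

    write(fd, "hello\n", 6);
    close(fd);
    return 0;
}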
You might be able to get the strxnet XTI emulation layer to work, but you're better off putting your time into getting off XTI. It's a dead API, unsupported not just on Linux, but also on the BSDs, including OS X.
(By the way, the strxnet library won't even build on the BSDs, because it depends on LiS, a STREAMS implementation for the Linux kernel. It won't even configure on a stock BSD or OS X system, apparently because it also depends on GNU sed.)
[*] I base this wild guess on the fact that Bash is the default shell for non-root users in all Linux distros I've used. You therefore have to go out of your way on Linux, as a rule, to get something other than Bash.
Best Answer
After discussion with a super-smart colleague, we think we have figured out what's wrong, though we have yet to prove it. What is likely happening is that the keep-alive interval is only taken into account when a keep-alive probe goes unacknowledged. So after 2 hours, if no ACK is received for the first keep-alive packet, a second packet is sent after 75 seconds, and further packets follow at 75-second intervals until an ACK is received.
It turns out that the interval is taken into account only when it is greater than the keep-alive time because of the way the Linux keep-alive mechanism works, as described in my other question.
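For reference, if you want to experiment without waiting 2 hours, Linux lets you override the keep-alive time, interval, and probe count per socket with setsockopt(2). A minimal sketch using the system defaults as example values (the enable_keepalive() helper name is just mine):

#include <sys/types.h>
#include <sys/socket.h>   /* setsockopt(), SO_KEEPALIVE */
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT */

/* Sketch: turn on keep-alive for one socket and set its timers.
 * Returns 0 on success, -1 on error. */
int
enable_keepalive(int fd)
{
    int on = 1;
    int idle = 7200;   /* seconds of idleness before the first probe */
    int intvl = 75;    /* seconds between unacknowledged probes */
    int cnt = 9;       /* probes sent before the connection is dropped */

    if (0 > setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)))
        return -1;
    if (0 > setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)))
        return -1;
    if (0 > setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)))
        return -1;
    if (0 > setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)))
        return -1;
    return 0;
}

Dropping idle and intvl to a few seconds makes the probe-and-ACK behavior described above easy to watch in a packet capture.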