Tmux – Prevent Session from Being Killed When Disconnecting from SSH

Tags: sshd, tmux

Summary: I'm trying to figure out why my tmux session dies when I disconnect from ssh.

Details:

I have tmux installed on an Arch Linux system. When I start a tmux session I can detach from it and then attach again while the ssh session is active. But if I end my ssh session then the tmux session gets killed.

I know this is not the normal behavior because I have another system where the tmux session keeps running after the ssh session ends, and I can attach to it after establishing a new ssh connection. The system that has the problem and the one that works correctly have very similar configurations, so I'm not sure what to check.

I'm running tmux version 1.9a. The system that has the problem (on which I have root access) runs Linux kernel 3.17.4-1, and the system that works correctly runs kernel 3.16.4-1-ARCH (I don't have root on that system). I doubt that the kernel version is the source of the problem, though; that's just one difference I noticed.

I thought I'd ask to see if anyone has seen a similar problem and knows of a possible solution.

The precise steps that lead to the problem are:

  1. ssh to machine
  2. run tmux to start tmux
  3. ctrl-B D to detach (at this point I could reattach with tmux attach)
  4. close ssh session (at this point the tmux session is killed, I've been able to observe this when I'm logged in as root in a different terminal)
  5. reconnect with ssh and run tmux attach, and I get the message no sessions; running tmux ls returns failed to connect to server: Connection refused. This makes sense because the server is not running. What doesn't make sense to me is why it gets killed in step 4 when I disconnect from the ssh session.

strace data:

In response to one of the comments I used strace to see what system calls the tmux server process makes. It looks like the tmux process is being killed when I exit my ssh session (by typing exit or with ctrl-d). Here's a snippet of the final part of the strace output.

poll([{fd=4, events=POLLIN}, {fd=11, events=POLLIN}, {fd=6, events=POLLIN}], 3, 424) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=1, si_uid=0} ---
sendto(3, "\17", 1, 0, NULL, 0)         = 1
+++ killed by SIGKILL +++

I compared this with a different system where tmux works properly and on that system the tmux process continues running even after I exit. So the root cause appears to be that the tmux process is being terminated when I close the ssh session. I'll need to spend some time troubleshooting this to figure out why, but I thought I would update my question since the strace suggestion was useful.
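The SIGTERM line in the trace above also records who sent the signal: si_pid=1 with si_uid=0 means it came from PID 1 running as root, which on a systemd-based system is systemd itself. A minimal shell sketch for pulling that field out of a saved strace line might look like this; the function name signal_sender_pid is made up for illustration.

```shell
# Extract the sending PID from a strace signal-delivery line.
# Prints nothing if the line carries no si_pid field.
signal_sender_pid() {
    printf '%s\n' "$1" | sed -n 's/.*si_pid=\([0-9][0-9]*\).*/\1/p'
}

# The SIGTERM line copied from the strace output above.
line='--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=1, si_uid=0} ---'
signal_sender_pid "$line"   # prints 1
```

A sender PID of 1 points at the init system rather than at sshd or the shell, which narrows down where to look next.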

Best Answer

Theory

Some init systems, including systemd, provide a feature to kill all processes belonging to a service. A service typically starts a single process, which then creates more processes by forking, and those processes can fork as well. All such processes are typically considered part of the service. In systemd this is done using cgroups.

In systemd, by default, all processes belonging to a service are killed when the service is stopped. The SSH server is obviously part of its service. When you connect to the server, the SSH server typically forks, and the new process handles your SSH session. Other server-side processes, including your screen or tmux, are started by forking from the SSH session process or its children, so they end up in the same service.
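To check which unit a running tmux server has ended up in, one can read its line in /proc/<pid>/cgroup; the last path component is usually the owning unit. The sketch below assumes that layout. The helper name cgroup_unit and the sample line are hypothetical and only illustrate the shape of the data.

```shell
# Print the last path component of a cgroup line, which under systemd
# is usually the unit (service or scope) that owns the process.
cgroup_unit() {
    printf '%s\n' "${1##*/}"
}

# Hypothetical line, of the kind /proc/<pid>/cgroup might contain for a
# tmux server started from a socket-activated SSH connection.
sample='1:name=systemd:/system.slice/system-sshd.slice/sshd@0-192.0.2.10:22-192.0.2.20:51234.service'
cgroup_unit "$sample"   # prints sshd@0-192.0.2.10:22-192.0.2.20:51234.service
```

If the unit printed for the tmux server is a per-connection SSH service, stopping that service takes tmux down with it. The systemd-cgls command shows the same information as a tree.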

KillMode and socket activation

The default behavior can be changed using the KillMode directive. The upstream OpenSSH project doesn't, AFAIK, include any .service files, so those vary by distribution, as do the unit names (for example, Arch Linux uses sshd.service and sshd.socket). There are typically two ways to enable SSH on your system. One is the classic ssh.service, which maintains a long-running SSH daemon listening on the network. The other is socket activation handled by ssh.socket, which in turn starts an instance of sshd@.service that runs only for a single SSH session.

Solutions

If your processes get killed at the end of the session, it is possible that you are using socket activation and that systemd kills them when it notices that the SSH session process has exited. In that case there are two solutions. One is to avoid socket activation by using ssh.service instead of ssh.socket. The other is to set KillMode=process in the Service section of the per-connection template unit (sshd@.service or ssh@.service, depending on the distribution).
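One way to apply the KillMode=process setting without editing the packaged unit file is a systemd drop-in, that is, a .conf file in a directory named after the unit with .d appended. The following is a minimal sketch under that assumption; the function name write_killmode_dropin and the file name killmode.conf are made up for illustration, and the unit name must match whatever your distribution ships.

```shell
# Write a drop-in that overrides KillMode for one unit.
# $1 is the drop-in directory, e.g. /etc/systemd/system/sshd@.service.d
write_killmode_dropin() {
    mkdir -p "$1" &&
    printf '[Service]\nKillMode=process\n' > "$1/killmode.conf"
}

# Demonstration against a scratch directory, so nothing under /etc is touched.
demo_dir="${TMPDIR:-/tmp}/sshd@.service.d.demo"
write_killmode_dropin "$demo_dir"
cat "$demo_dir/killmode.conf"
```

On a real system this would be run as root against the directory under /etc/systemd/system, followed by systemctl daemon-reload. The change should only affect SSH connections opened afterwards.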

The KillMode=process setting may also be useful with the classic ssh.service, as it avoids killing the SSH session process or the screen or tmux processes when the server gets stopped or restarted.

Future notes

This answer has apparently become fairly popular. While it worked for the OP, it may not work for someone in the future because of changes in systemd-logind or its configuration. Please check the documentation on logind sessions if you see behavior different from what this answer describes.
