Why SSH -t Doesn’t Wait for Background Processes


Why is it that ssh -t doesn't wait for background jobs to finish?

Example:

ssh user@example 'sleep 2 &'

This works as expected, since ssh returns after 2 seconds, whereas

ssh user@example -t 'sleep 2 &'

does not wait for sleep to finish and returns immediately.

Can anyone explain the reason behind this? Is there a way to let ssh -t wait for all background processes to finish before returning?

My use case is that I start a script with ssh -t, and this script starts several background jobs that should stay alive after the main script finishes. With ssh -t this is not possible so far.

Best Answer

Without -t, sshd reads the stdout and stderr of the remote shell (and of its children, such as sleep) through two pipes, and sends the client's input to the shell through a third pipe.

sshd waits for the process in which it started the user's login shell, and once that process has terminated it also waits for EOF on the stdout pipe (but not on the stderr pipe, at least in the case of OpenSSH).

EOF occurs only when no process holds a file descriptor open on the writing end of the pipe, which typically happens only once every process that didn't have its stdout redirected elsewhere is gone.
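The pipe behaviour can be reproduced locally without ssh. In this minimal sketch, cat stands in for sshd reading the stdout pipe and the inner sh stands in for the remote shell; both roles are assumptions made for illustration, not how sshd is actually implemented. The variable names are hypothetical.

```shell
#!/bin/sh
# Case 1: the background sleep inherits the pipe as its stdout, so the
# reader (cat) does not see EOF until sleep exits, about 2 seconds later.
start=$(date +%s)
sh -c 'sleep 2 &' | cat
end=$(date +%s)
elapsed_inherited=$((end - start))

# Case 2: the background sleep has stdout and stderr redirected away from
# the pipe, so the write end is closed as soon as the inner sh exits.
start=$(date +%s)
sh -c 'sleep 2 >/dev/null 2>&1 &' | cat
end=$(date +%s)
elapsed_redirected=$((end - start))

echo "inherited stdout:  ${elapsed_inherited}s"
echo "redirected stdout: ${elapsed_redirected}s"
```

The first pipeline should take about two seconds and the second should return almost at once, which mirrors why ssh without -t waits for sleep only while sleep still holds the stdout pipe open.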

When you use -t, sshd doesn't use pipes. Instead, all interaction (stdin, stdout, stderr) with the remote shell and its children goes through a single pseudo-terminal pair.
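A quick way to see which of the two cases a command runs under is to test whether its stdout is a terminal. This is only a sketch; stdout_kind is a hypothetical helper name, and the pipe through cat imitates the case without -t.

```shell
#!/bin/sh
# Hypothetical helper: reports whether the current stdout is a terminal.
stdout_kind() {
    if [ -t 1 ]; then
        echo "tty"
    else
        echo "not-tty"
    fi
}

# Piped through cat, stdout is a pipe rather than a terminal.
stdout_kind | cat
```

Running the same test remotely, for example ssh user@example '[ -t 1 ] && echo tty || echo not-tty' with and without -t, should show the difference between the pseudo-terminal and the pipe setup.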

With a pseudo-terminal pair, sshd interacts with the master side, and there is no similar EOF handling. At least some systems provide other ways to tell whether processes still have file descriptors open on the slave side of the pseudo-terminal (see @JdeBP's comment below), but sshd doesn't use them. It simply waits for the process in which it executed the remote user's login shell to terminate, and then exits.

Upon that exit, the master side of the pty pair is closed, which means the pty is destroyed, so processes controlled by the slave will receive a SIGHUP (which by default would terminate them).

Edit: that last part was incorrect, though the end result is the same. See @pynexj's answer for a correct description of what exactly happens.
