Typically, an SSH terminal session hangs on exit if background connections are still open. By background connections, I mean things such as:
- X11 window forwarding
- STDOUT and STDERR
Have a look at the connections that are still active on your hung SSH session by typing ~# in your hung SSH terminal.
It could be that your script is opening sessions you didn't know about, or that one of your remote machine's shell startup files, such as .profile (or .bashrc, etc.), contains something that establishes a session. Good luck hunting!
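The STDOUT and STDERR case can be reproduced locally without SSH. The sketch below is only an analogy: cat stands in for the SSH session, and a backgrounded sleep stands in for whatever your script left running. The reader cannot finish until every process holding the write end of the pipe has exited or redirected its output elsewhere.

```shell
# Local analogy for a hung SSH session: cat (the "session") cannot see
# end-of-file until every process holding the pipe's write end is gone.

start=$(date +%s)
( sleep 3 & ) | cat                   # background sleep inherits the pipe as STDOUT
held=$(( $(date +%s) - start ))       # roughly 3 seconds: cat waited for sleep

start=$(date +%s)
( sleep 3 >/dev/null 2>&1 & ) | cat   # STDOUT and STDERR point away from the pipe
released=$(( $(date +%s) - start ))   # roughly 0 seconds: nothing holds the pipe

echo "held=${held}s released=${released}s"
```

Over SSH, the same fix is usually written along the lines of ssh user@host 'long_job >/dev/null 2>&1 </dev/null &', where user@host and long_job are placeholders for your own host and command.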
By the way, some of the other escape sequences offered by OpenSSH clients may also be useful:
Supported escape sequences:
~. - terminate connection (and any multiplexed sessions)
~B - send a BREAK to the remote system
~C - open a command line
~R - Request rekey (SSH protocol 2 only)
~^Z - suspend ssh
~# - list forwarded connections
~& - background ssh (when waiting for connections to terminate)
~? - this message
~~ - send the escape character by typing it twice
(Note that escapes are only recognized immediately after newline.)
One other thing: if you want SSH to just run your commands without giving you a remote terminal session, you can use the -f option to ssh. That makes ssh put itself in the background just before it executes the command.
A process isn't "killed with SIGHUP" -- at least, not in the strict sense of the word. Rather, when the connection is dropped, the terminal's controlling process (in this case, Bash) is sent a hang-up signal*, which is commonly abbreviated to the "HUP signal", or just SIGHUP.
Now, when a process receives a signal, it can handle it any way it wants**. The default for most signals (including HUP) is to exit immediately. However, the program is free to ignore the signal instead, or even to run some kind of signal handler function.
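Bash chooses the last option. When an interactive Bash receives SIGHUP, its handler resends the signal to each of its jobs, running or stopped, and only once it's finished with that does Bash exit. (The "huponexit" option governs a separate case: whether an interactive login shell also sends SIGHUP to its jobs when it exits normally.)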
Likewise, each child process is free to do whatever it wants when it receives the signal: leave it set to the default (i.e. die immediately), ignore it, or run a signal handler.
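The three possible responses can be observed from a plain shell. The following is a minimal sketch, not a statement of how any particular program behaves: hup_status is a hypothetical helper, and the exit codes 7 and 42 are arbitrary values chosen so the three outcomes are easy to tell apart.

```shell
# Send SIGHUP to a small sh program and record how it ended.
# $1 is the program text; the result is left in $status.
hup_status() {
    sh -c "$1" &
    pid=$!
    sleep 1                     # give the child time to install its trap
    kill -HUP "$pid"
    status=0
    wait "$pid" || status=$?
}

# 1. Default action: the process dies. The shell reports 128 + signal number.
hup_status 'exec sleep 3'
default_status=$status

# 2. Ignore: the signal is discarded and the program ends with its own exit code.
hup_status 'trap "" HUP; sleep 2; exit 7'
ignore_status=$status

# 3. Handler: the trap runs, cleans up its child, and chooses the exit code.
hup_status 'trap "kill \$child; exit 42" HUP; sleep 5 & child=$!; wait "$child"'
handler_status=$status

echo "default=$default_status ignore=$ignore_status handler=$handler_status"
```

Assuming the script itself was not started with SIGHUP already ignored, this should print default=129 ignore=7 handler=42, since SIGHUP is signal number 1.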
Nohup only changes the default action for the child process to "ignore". Once the child process is running, however, it's free to change its own response to the signal.
This, I think, is why some programs die even though you ran them with nohup:
- Nohup sets the default action to "ignore".
- The program needs to do some kind of cleanup when it exits, so it installs a SIGHUP handler, incidentally overwriting the "ignore" flag.
- When the SIGHUP arrives, the handler runs, cleaning up the program's data files (or whatever needed to be done) and exits the program.
- The user doesn't know or care about the handler or cleanup, and just sees that the program exited despite nohup.
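The first step in that list is easy to check with a program that never installs a handler of its own. A small sketch, assuming the standard nohup and sleep utilities are available:

```shell
# Without nohup: SIGHUP takes its default action and sleep is killed.
sleep 3 &
pid=$!
sleep 1
kill -HUP "$pid"
plain_status=0
wait "$pid" || plain_status=$?     # 128 + 1 (SIGHUP)

# With nohup: sleep starts with SIGHUP ignored and never changes that,
# so the signal is discarded and sleep runs to completion.
nohup sleep 3 >/dev/null 2>&1 &
pid=$!
sleep 1
kill -HUP "$pid"
nohup_status=0
wait "$pid" || nohup_status=$?     # normal exit

echo "plain=$plain_status nohup=$nohup_status"
```

A program that installs its own SIGHUP handler would replace the "ignore" setting that nohup handed it, which is the second step in the list; sleep does not do that, so it survives.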
This is where "disown" comes in. A process that's been disowned by Bash is never sent the HUP signal, regardless of the huponexit option. So even if the program sets up its own signal handler, the signal is never actually sent, so the handler never runs. Note, however, that if the program tries to display some text to a user that's logged out, it will cause an I/O error, which could cause the program to exit anyway.
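As a rough illustration, the sketch below shows disown removing a job from Bash's job table, which is the list Bash walks when it resends SIGHUP. It assumes bash is installed, and it redirects all three standard streams so the job never touches the terminal, which avoids the I/O error mentioned above.

```shell
# disown is a Bash builtin, so the demonstration runs inside bash explicitly.
result=$(bash -c '
    # Redirect all three standard streams away from the terminal.
    sleep 2 >/dev/null 2>&1 </dev/null &
    before=$(jobs -p | wc -l)
    disown                        # remove the most recent job from the job table
    after=$(jobs -p | wc -l)
    echo "$((before)) $((after))"
')
jobs_before=${result% *}
jobs_after=${result#* }
echo "jobs before disown: $jobs_before, after: $jobs_after"
```

In an interactive session the usual form is simply the command followed by & and then disown on the next line.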
* And, yes, before you ask, the "hang-up" terminology is left over from UNIX's dialup mainframe days.
** Most signals, anyway. SIGKILL, for instance, always causes the program to terminate immediately, period.
Best Answer
Your process is probably dead, but it still shows up in the process table because it is a "zombie process". A child process becomes a zombie when it has terminated and released everything except its process table entry, but its parent has not yet fetched its termination status (through any of the wait functions). Killing a zombie with a signal won't work, because it has already terminated. What you need to do is find its parent process and kill that one cleanly, that is, without using kill -9.
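To see this behaviour for yourself, the following sketch deliberately creates a zombie and shows that kill -9 leaves it untouched. It is an illustration under assumptions: proc_state is a hypothetical helper that uses ps where it accepts these options and falls back to /proc on Linux, and the 30-second sleep merely plays the role of a parent that never calls wait.

```shell
# Print the state letters for a PID: ps where it supports these options,
# otherwise field 3 of /proc/PID/stat (Linux only).
proc_state() {
    { ps -o stat= -p "$1" 2>/dev/null ||
      awk '{print $3}' "/proc/$1/stat" 2>/dev/null; } | tr -d ' '
}

pidfile=$(mktemp)

# The inner shell starts a short-lived child, records its PID, then replaces
# itself with a sleep that never calls wait, so the child is never reaped.
sh -c 'sleep 1 & echo $! > "$0"; exec sleep 30' "$pidfile" &
parent=$!

sleep 2                                  # the child has exited by now
child=$(cat "$pidfile")
state_before=$(proc_state "$child")      # expected to start with Z (zombie)

kill -9 "$child" 2>/dev/null || true     # the process is already dead: no effect
state_after=$(proc_state "$child")       # expected to still start with Z

# Terminating the parent (plain SIGTERM, not -9) orphans the zombie so that
# init, or whatever reaper the system uses, can collect it.
kill "$parent"
wait "$parent" 2>/dev/null || true
rm -f "$pidfile"

echo "before kill -9: $state_before, after kill -9: $state_after"
```

On a real system you would find the parent with something like ps -o ppid= -p followed by the zombie's PID, rather than knowing it in advance as this sketch does.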
Here are two simple steps to kill a zombie...