I often come across the situation when developing where I am running a binary file, say `a.out`, in the background as it does some lengthy job. While it's doing that, I make changes to the C code which produced `a.out` and compile `a.out` again. So far, I haven't had any problems with this: the process which is running `a.out` continues as normal, never crashes, and always runs the old code from which it originally started.
However, say `a.out` was a huge file, maybe comparable to the size of the RAM. What would happen in this case? And say it linked to a shared object file, `libblas.so`; what if I modified `libblas.so` during runtime? What would happen?
My main question is: does the OS guarantee that when I run `a.out`, the original code will always run normally, as per the original binary, regardless of the size of the binary or of the `.so` files it links to, even when those `.o` and `.so` files are modified during runtime?
I know there are these questions that address similar issues:
https://stackoverflow.com/questions/8506865/when-a-binary-file-runs-does-it-copy-its-entire-binary-data-into-memory-at-once
What happens if you edit a script during execution?
How is it possible to do a live update while a program is running?
These questions have helped me understand a bit more about this, but I don't think they ask exactly what I want: a general rule for the consequences of modifying a binary during execution.
Best Answer
While the Stack Overflow question seemed to be enough at first, I understand, from your comments, why you may still have a doubt about this. To me, this is exactly the kind of critical situation involved when the two UNIX subsystems (processes and files) communicate.
As you may know, UNIX systems are usually divided into two subsystems: the file subsystem and the process subsystem. Now, unless it is instructed otherwise through a system call, the kernel should not have these two subsystems interact with one another. There is, however, one exception: the loading of an executable file into a process' text regions. Of course, one may argue that this operation is also triggered by a system call (`execve`), but this is usually known to be the one case where the process subsystem makes an implicit request to the file subsystem.

Because the process subsystem naturally has no way of handling files (otherwise there would be no point in dividing the whole thing in two), it has to use whatever the file subsystem provides to access files. This also means that the process subsystem is subject to whatever measures the file subsystem takes regarding file modification/deletion. On this point, I would recommend reading Gilles' answer to this U&L question. The rest of my answer is based on that more general one from Gilles.
The first thing that should be noted is that, internally, files are only accessible through inodes. If the kernel is given a path, its first step will be to translate it into an inode to be used for all other operations. When a process loads an executable into memory, it does so through its inode, which has been provided by the file subsystem after translation of a path. Inodes may be associated with several paths (links), and programs can only delete links. In order to delete a file and its inode, userland must remove all existing links to that inode, and ensure that it is completely unused. When these conditions are met, the kernel will automatically delete the file from disk.
If you have a look at the replacing executables part of Gilles' answer, you'll see that depending on how you edit/delete the file, the kernel will react/adapt differently, always through a mechanism implemented within the file subsystem.
- Overwriting the file in place: the kernel will deny the operation, and the write will fail with an error (`ETXTBSY`). No consequences whatsoever.
- Replacing the file (writing the new contents to a separate file, then moving it over the old name): the `mv` operation is an atomic one. This will probably require the use of the `rename` system call, and since processes can't be interrupted while in kernel mode, nothing can interfere with this operation until it completes (successfully or not). Again, there is no alteration of the old file's inode: a new one is created, and already-running processes will have no knowledge of it, even if it's been associated with one of the old inode's links.
- Recompiling a file: when using `gcc` (and the behaviour is probably similar for many other compilers), you are using the second strategy. You can see that by running a `strace` of your compiler's processes: the compiler detects, through the `stat` and `lstat` system calls, that the output file already exists, and unlinks it. Even though the old `a.out` has been unlinked, its inode and contents remain on disk, for as long as they are being used by already-running processes. A new file is then created and associated with the name `a.out`: this is a brand new inode, with brand new contents, which already-running processes don't care about.

Now, when it comes to shared libraries, the same behaviour will apply. As long as a library object is used by a process, it will not be deleted from disk, no matter how you change its links. Whenever something has to be loaded into memory, the kernel will do it through the file's inode, and will therefore ignore the changes you made to its links (such as associating them with new files).