Linux – Modifying binary during execution

binary, linux

When developing, I often run a binary, say a.out, in the background while it does some lengthy job. While it's doing that, I make changes to the C code which produced a.out and compile a.out again. So far, I haven't had any problems with this. The process which is running a.out continues as normal, never crashes, and always runs the old code from which it originally started.

However, say a.out was a huge file, maybe comparable to the size of the RAM. What would happen in this case? And say it linked to a shared object file, libblas.so, what if I modified libblas.so during runtime? What would happen?

My main question is: does the OS guarantee that when I run a.out, the original code will always run normally, as per the original binary, regardless of the size of the binary or the .so files it links to, even when those .o and .so files are modified during runtime?

I know there are these questions that address similar issues:
https://stackoverflow.com/questions/8506865/when-a-binary-file-runs-does-it-copy-its-entire-binary-data-into-memory-at-once
What happens if you edit a script during execution?
How is it possible to do a live update while a program is running?

These have helped me understand a bit more about this, but I don't think they ask exactly what I want to know, which is a general rule for the consequences of modifying a binary during execution.

Best Answer

While the Stack Overflow question seemed to be enough at first, I understand, from your comments, why you may still have a doubt about this. To me, this is exactly the kind of critical situation involved when the two UNIX subsystems (processes and files) communicate.

As you may know, UNIX systems are usually divided into two subsystems: the file subsystem, and the process subsystem. Now, unless it is instructed otherwise through a system call, the kernel should not have these two subsystems interact with one another. There is however one exception: the loading of an executable file into a process' text regions. Of course, one may argue that this operation is also triggered by a system call (execve), but this is usually known to be the one case where the process subsystem makes an implicit request to the file subsystem.

Because the process subsystem naturally has no way of handling files (otherwise there would be no point in dividing the whole thing in two), it has to use whatever the file subsystem provides to access files. This also means that the process subsystem is subject to whatever measures the file subsystem takes regarding file editing and deletion. On this point, I would recommend reading Gilles' answer to this U&L question. The rest of my answer is based on this more general one from Gilles.

The first thing that should be noted is that internally, files are only accessible through inodes. If the kernel is given a path, its first step will be to translate it into an inode to be used for all other operations. When a process loads an executable into memory, it does it through its inode, which has been provided by the file subsystem after translation of a path. Inodes may be associated with several paths (links), and programs may only delete links. In order to delete a file and its inode, userland must remove all existing links to that inode, and ensure that it is completely unused. When these conditions are met, the kernel will automatically delete the file from disk.
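The link/inode relationship described above can be observed from userland. The following is a minimal sketch, assuming a POSIX filesystem that supports hard links; the function name and file names are made up for illustration.

```python
import os
import tempfile


def link_demo():
    """Show that two paths can name one inode, and that removing one
    path (link) leaves the file reachable through the other."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")
        second = os.path.join(d, "second")
        with open(first, "w") as f:
            f.write("payload")

        os.link(first, second)  # add a second path to the same inode
        st_first, st_second = os.stat(first), os.stat(second)
        same_inode = st_first.st_ino == st_second.st_ino
        links_before = st_first.st_nlink

        os.unlink(first)  # removes one link only, not the file itself
        links_after = os.stat(second).st_nlink
        with open(second) as f:
            content = f.read()

        return same_inode, links_before, links_after, content


if __name__ == "__main__":
    print(link_demo())
```

The file's data only becomes eligible for deletion once the link count reaches zero and nothing is using the inode any more.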

If you have a look at the replacing executables part of Gilles' answer, you'll see that depending on how you edit/delete the file, the kernel will react/adapt differently, always through a mechanism implemented within the file subsystem.

If you try strategy one (open/truncate to zero/write or open/write/truncate to new size), you'll see that the kernel won't bother handling your request. You'll get an error 26: Text file busy (ETXTBSY). No consequences whatsoever.
If you try strategy two, the first step is to delete your executable. However, since it is being used by a process, the file subsystem will kick in and prevent the file (and its inode) from being truly deleted from disk. From this point, the only way to access the old file's content is to do it through its inode, which is what the process subsystem does whenever it needs to load new data into text sections (internally, there is no point in using paths, except when translating them into inodes). Even though you've unlinked the file (removed all its paths), the process can still use it as if you'd done nothing. Creating a new file with the old path doesn't change anything: the new file will be given a completely new inode, which the running process has no knowledge of.

Strategies 2 and 3 are safe for executables as well: although running executables (and dynamically loaded libraries) aren't open files in the sense of having a file descriptor, they behave in a very similar way. As long as some program is running the code, the file remains on disk even without a directory entry.

Strategy three is quite similar, since the mv operation is atomic when source and destination are on the same filesystem. In that case mv uses the rename system call, and the kernel guarantees that other processes looking up the name see either the old file or the new one, never an intermediate state. Again, there is no alteration of the old file's inode: the new file has its own inode, and already-running processes will have no knowledge of it, even though it is now associated with one of the old inode's links.

With strategy 3, the step of moving the new file to the existing name removes the directory entry leading to the old content and creates a directory entry leading to the new content. This is done in one atomic operation, so this strategy has a major advantage: if a process opens the file at any time, it will either see the old content or the new content — there's no risk of getting mixed content or of the file not existing.
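Strategy 3 can be sketched as follows, under the assumption of a POSIX system where both files live on the same filesystem. An open file handle stands in for the already-running process here; the function and file names are hypothetical.

```python
import os
import tempfile


def replace_demo():
    """Write new content to a staging file, then rename it over the target.
    A handle opened before the rename keeps seeing the old content."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "prog")
        staging = os.path.join(d, "prog.new")
        with open(target, "w") as f:
            f.write("old")

        reader = open(target)  # stands in for a process using the old file
        old_inode = os.fstat(reader.fileno()).st_ino

        with open(staging, "w") as f:
            f.write("new")
        os.replace(staging, target)  # rename(2): atomic swap of the directory entry

        seen_by_reader = reader.read()
        reader.close()
        with open(target) as f:
            seen_by_path = f.read()
        inode_changed = os.stat(target).st_ino != old_inode

        return seen_by_reader, seen_by_path, inode_changed


if __name__ == "__main__":
    print(replace_demo())
```

Anyone who opens the path after the rename gets the new inode, while the earlier handle is unaffected.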

Recompiling a file: when using gcc (and the behaviour is probably similar for many other compilers), you are using strategy 2. You can see that by running strace on your compiler's processes:

stat("a.out", {st_mode=S_IFREG|0750, st_size=8511, ...}) = 0
unlink("a.out") = 0
open("a.out", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
chmod("a.out", 0750) = 0

The compiler detects that the file already exists through the stat and lstat system calls.
The file is unlinked. Here, while it is no longer accessible through the name a.out, its inode and contents remain on disk, for as long as they are being used by already-running processes.
A new file is created and made executable under the name a.out. This is a brand new inode, and brand new contents, which already-running processes don't care about.
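The same stat/unlink/open/chmod sequence can be replayed by hand to watch its effect. This is only a sketch that mimics the calls seen in the trace, not how gcc itself is implemented; `rebuild_like_gcc` and `demo` are hypothetical names, and an open file handle stands in for the running process.

```python
import os
import tempfile


def rebuild_like_gcc(path, new_bytes, mode=0o750):
    """Replay the traced sequence: stat, unlink, open(O_CREAT|O_TRUNC), chmod."""
    os.stat(path)    # raises FileNotFoundError if the file is absent
    os.unlink(path)  # drop the old name; the old inode survives while in use
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o666)
    try:
        os.write(fd, new_bytes)
    finally:
        os.close(fd)
    os.chmod(path, mode)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "a.out")
        with open(path, "wb") as f:
            f.write(b"old build")
        os.chmod(path, 0o750)

        running = open(path, "rb")  # stands in for the running process
        old_inode = os.fstat(running.fileno()).st_ino

        rebuild_like_gcc(path, b"new build")

        with open(path, "rb") as f:
            via_path = f.read()
        result = {
            "old_handle": running.read(),
            "path": via_path,
            "inode_changed": os.stat(path).st_ino != old_inode,
            # the old inode has no names left, yet it is still readable
            "old_link_count": os.fstat(running.fileno()).st_nlink,
        }
        running.close()
        return result


if __name__ == "__main__":
    print(demo())
```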

Now, when it comes to shared libraries, the same behaviour will apply. As long as a library object is used by a process, it will not be deleted from disk, no matter how you change its links. Whenever something has to be loaded into memory, the kernel will do it through the file's inode, and will therefore ignore the changes you made to its links (such as associating them with new files).
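Libraries are mapped into memory rather than read through a file descriptor, so a memory mapping is a closer stand-in than an open file. Here is a minimal sketch, assuming a POSIX system; the file is just a few bytes named libdemo.so for illustration, not a real library. It only covers the unlink-and-recreate case described above; overwriting the same inode in place is a different situation.

```python
import mmap
import os
import tempfile


def mapping_demo():
    """Map a file, unlink it, create a new file under the same name,
    and compare what the mapping and the path each show."""
    with tempfile.TemporaryDirectory() as d:
        lib = os.path.join(d, "libdemo.so")  # hypothetical name, plain bytes
        with open(lib, "wb") as f:
            f.write(b"old library bytes")

        with open(lib, "rb") as f:
            mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        # The descriptor is closed now; the mapping alone keeps the inode in use.

        os.unlink(lib)
        with open(lib, "wb") as f:  # new inode under the old name
            f.write(b"new library bytes")

        via_mapping = bytes(mapped[:])
        mapped.close()
        with open(lib, "rb") as f:
            via_path = f.read()

        return via_mapping, via_path


if __name__ == "__main__":
    print(mapping_demo())
```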

Related Solutions

Ubuntu – executing binary file: file not found

If I run ldd -v as on my system, I get:

./as: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found (required by ./as)
        linux-vdso.so.1 =>  (0x00007fff89ab1000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f1e4c81f000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1e4c498000)
        /lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f1e4ca6d000)

So yeah, it looks like these binaries are looking for a GLIBC_2.14 symbol, which you are presumably missing on your system. As svenx pointed out, it looks like it's searching for the memcpy@@GLIBC_2.14 symbol. Some more information on why memcpy was given a new version is described in this bug report.

Installing a new version of glibc on your target system should fix it. If you want to try to rebuild the binary to still work on the old version of glibc, you could try tricks like the one listed here. You could also maybe get by with a shim that just provides the specific version of the memcpy symbol that you need, but that gets to be a bit hacky.

After reading your update: you're right, that wasn't your problem. But I think I've found it: your binary is requesting the interpreter /lib/ld-linux-x86-64.so.2, which doesn't exist on Ubuntu 12.04 systems:

$ readelf -a ./as | grep interpreter
      [Requesting program interpreter: /lib/ld-linux-x86-64.so.2]

While ldd knew to find it in /lib64 instead, I suppose the kernel doesn't know that when it tries to run the binary and can't find the file's requested interpreter. You could try just running it through the interpreter manually:
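The interpreter path that readelf reports comes from the PT_INTERP program header, and it can be checked directly against the filesystem. The following is a rough sketch that handles only 64-bit little-endian ELF files; `read_interpreter` and `interpreter_exists` are hypothetical helper names, not part of any existing tool.

```python
import os
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interpreter(path):
    """Return the requested program interpreter of an ELF file, or None."""
    with open(path, "rb") as f:
        header = f.read(64)
        if len(header) < 64 or header[:4] != b"\x7fELF":
            raise ValueError("not an ELF file")
        if header[4] != 2 or header[5] != 1:
            raise ValueError("this sketch handles 64-bit little-endian ELF only")

        fields = struct.unpack("<16sHHIQQQIHHHHHH", header)
        e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

        for i in range(e_phnum):
            f.seek(e_phoff + i * e_phentsize)
            raw = f.read(56)
            if len(raw) < 56:
                break
            p_type, _, p_offset, _, _, p_filesz, _, _ = struct.unpack("<IIQQQQQQ", raw)
            if p_type == PT_INTERP:
                f.seek(p_offset)
                return f.read(p_filesz).rstrip(b"\0").decode()
    return None


def interpreter_exists(path):
    """Return (interpreter, exists) so a missing loader is easy to spot."""
    interp = read_interpreter(path)
    return interp, (interp is not None and os.path.exists(interp))
```

Calling interpreter_exists("./as") on the binary above should report the same path that readelf printed, together with whether that file is present on the system.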

$ pwd
/home/jim/mingw64/x86_64-w64-mingw32/bin
$ ./as --version
-bash: ./as: No such file or directory
$ /lib64/ld-linux-x86-64.so.2 ./as --version
GNU assembler (rubenvb-4.7.1-1-release) 2.23.51.20120808
Copyright 2012 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-w64-mingw32'.

I'm not 100% certain this is working correctly -- on my system, running gcc this way gives a segmentation fault. But that's at least a different problem.

Linux – Oldest binary working on Linux

I think that /bin/true has to be the oldest working ..

Well, can you call a zero-byte file a binary?

touch /tmp/old_true
chmod 755 /tmp/old_true
/tmp/old_true
echo $?
