How is it possible to do a live update while a program is running?

Tags: executable, files, upgrade

I wonder how killer applications such as Thunderbird or Firefox can be updated via the system's package manager while they are still running. What happens with the old code while they are being updated? What do I have to do when I want to write a program a.out that updates itself while it is running?

Best Answer

Replacing files in general

First, there are several strategies to replace a file:

  1. Open the existing file for writing, truncate it to 0 length, and write the new content. (A less common variant is to open the existing file, overwrite the old content with the new content, and truncate the file to the new length if it's shorter.) In shell terms:

    echo 'new content' >somefile
    
  2. Remove the old file, and create a new file by the same name. In shell terms:

    rm somefile
    echo 'new content' >somefile
    
  3. Write to a new file under a temporary name, then move the new file to the existing name. The move deletes the old file. In shell terms:

    echo 'new content' >somefile.new
    mv somefile.new somefile
    

I won't list all the differences between the strategies; I'll just mention some that are important here. With strategy 1, if any process is currently using the file, the process sees the new content as it's being updated. This can cause some confusion if the process expects the file content to remain the same. Note that this is only about processes that have the file open (as visible in lsof or in /proc/PID/fd/). Interactive applications that have a document open (e.g. a file opened in an editor) usually do not keep the file open: they load the file content during the “open document” operation and replace the file (using one of the strategies above) during the “save document” operation.
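Here's a minimal sketch of the strategy 1 behaviour, assuming a POSIX shell with mktemp available. The scratch directory and the variable names are mine, purely for illustration. Descriptor 3 is opened before the update and read after it:

```shell
dir=$(mktemp -d)                    # scratch directory (illustrative)
echo 'old content' >"$dir/somefile"
exec 3<"$dir/somefile"              # hold the file open on descriptor 3, nothing read yet
echo 'new content' >"$dir/somefile" # strategy 1: truncate and rewrite the same file
seen=$(cat <&3)                     # read through the descriptor opened before the update
exec 3<&-                           # close descriptor 3
echo "reader saw: $seen"
rm -rf "$dir"
```

The reader gets the new content even though it opened the file before the update, because strategy 1 modifies the very file the descriptor points to.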

With strategies 2 and 3, if some process has the file somefile open, the old file remains open during the content upgrade. With strategy 2, the step of removing the file in fact only removes the file's entry in the directory. The file itself is only removed when it has no directory entry leading to it (on typical Unix filesystems, there can be more than one directory entry for the same file) and no process has it open. Here's a way to observe this — the file is only removed when the sleep process is killed (rm only removes its directory entry).

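    echo 'old content' >somefile
    sleep 9999999 <somefile &   # keeps somefile open as its standard input
    df .
    rm somefile                 # removes only the directory entry
    df .                        # the file's blocks are still allocated
    cat /proc/$!/fd/0           # the content is still reachable through the open descriptor
    kill $!
    df .                        # once sleep has exited, the blocks are freed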

With strategy 3, the step of moving the new file to the existing name removes the directory entry leading to the old content and creates a directory entry leading to the new content. This is done in one atomic operation, so this strategy has a major advantage: if a process opens the file at any time, it will either see the old content or the new content — there's no risk of getting mixed content or of the file not existing.
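A minimal sketch of strategy 3, assuming a POSIX shell with mktemp; the scratch directory and variable names are illustrative. The temporary file is created in the same directory as the target so that mv is a plain rename rather than a copy across filesystems:

```shell
dir=$(mktemp -d)                      # scratch directory (illustrative)
echo 'old content' >"$dir/somefile"
exec 3<"$dir/somefile"                # a reader that opened the old file
tmp=$(mktemp "$dir/somefile.XXXXXX")  # temporary name on the same filesystem
echo 'new content' >"$tmp"
mv "$tmp" "$dir/somefile"             # atomic switch of the directory entry
old_view=$(cat <&3)                   # the descriptor still leads to the old file
new_view=$(cat "$dir/somefile")       # a fresh open follows the new directory entry
exec 3<&-
echo "open descriptor: $old_view"
echo "fresh open:      $new_view"
rm -rf "$dir"
```

Note that mktemp creates the file with restrictive permissions, so a real updater would also have to set the mode and ownership of the new file before the mv.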

Replacing executables

If you try strategy 1 with a running executable on Linux, you'll get an error.

    cp /bin/sleep .
    ./sleep 999999 &
    echo oops >|sleep
    bash: sleep: Text file busy

For obscure historical reasons, a “text file” here means a file containing executable code. Linux, like many other unix variants, refuses to overwrite the code of a running program; a few unix variants allow this, leading to crashes unless the new code was a very well thought-out modification of the old code.

On Linux, you can overwrite the code of a dynamically loaded library. It's likely to lead to a crash of the program that's using it. (You might not be able to observe this with sleep because it loads all the library code it needs when it starts. Try a more complex program that does something useful after sleeping, like perl -e 'sleep 9; print lc $ARGV[0]'.)

If an interpreter is running a script, the script file is opened in an ordinary way by the interpreter, so there is no protection against overwriting the script. Some interpreters read and parse the whole script before they start executing the first line, others read the script as needed. See What happens if you edit a script during execution? and How Does Linux deal with shell scripts? for more details.

Strategies 2 and 3 are safe for executables as well: although running executables (and dynamically loaded libraries) aren't open files in the sense of having a file descriptor, they behave in a very similar way. As long as some program is running the code, the file remains on disk even without a directory entry.
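To see this on Linux, here is a rough sketch that replaces a running copy of sleep with strategy 3. It assumes /proc is mounted and that the temporary directory allows executing files; the second copy of /bin/sleep is only a stand-in for an upgraded binary:

```shell
dir=$(mktemp -d)                   # scratch directory (illustrative)
cp /bin/sleep "$dir/sleep"
"$dir/sleep" 999999 &
pid=$!
sleep 1                            # give the background process time to start running
cp /bin/sleep "$dir/sleep.new"     # stand-in for the upgraded binary
mv "$dir/sleep.new" "$dir/sleep"   # strategy 3: no "Text file busy" error
mv_status=$?
exe=$(readlink "/proc/$pid/exe")   # Linux-specific: the file the process is running
echo "mv exit status: $mv_status"
echo "running code:   $exe"
kill "$pid"
rm -rf "$dir"
```

The link target should end in "(deleted)": the process keeps running the old file, which no longer has a directory entry.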

Upgrading an application

Most package managers use strategy 3 to replace files, because of the major advantage mentioned above — at any point in time, opening the file leads to a valid version of it.

Where application upgrades can break is that while upgrading one file is atomic, upgrading the application as a whole isn't if the application consists of multiple files (program, libraries, data, …). Consider the following sequence of events:

  1. An instance of the application is started.
  2. The application is upgraded.
  3. The running instance of the application opens one of its data files.

In step 3, the running instance of the old version of the application is opening a data file from the new version. Whether this works or not depends on the application, on which file it is and on how much the file has been modified.
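The sequence can be simulated with a toy sketch. Everything here is hypothetical: the "application" is a subshell that waits a second before opening its data file, and the file name and its format=1 / format=2 content are made up for the example:

```shell
dir=$(mktemp -d)                             # scratch directory (illustrative)
echo 'format=1' >"$dir/data"                 # data file shipped with the old version
# Step 1: the old instance starts, but only opens its data file later.
( sleep 1; cat "$dir/data" ) >"$dir/seen" &
# Step 2: the upgrade replaces the data file atomically (strategy 3).
echo 'format=2' >"$dir/data.new"
mv "$dir/data.new" "$dir/data"
# Step 3: the old instance opens the file and gets the new version's data.
wait
seen=$(cat "$dir/seen")
echo "old instance read: $seen"
rm -rf "$dir"
```

Each file was replaced atomically, yet the old instance still ends up with data written for the new version.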

After an upgrade, you'll note that the old program is still running. If you want to run the new version, you'll have to exit the old program and run the new version. Package managers usually kill and restart daemons on an upgrade, but leave end-user applications alone.

A few daemons have special procedures to handle upgrades without having to kill the daemon and wait for the new instance to restart (which causes a service disruption). This is necessary in the case of init, which cannot be killed; init systems provide a way to request that the running instance call execve to replace itself with the new version.
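As for a program that updates itself, here is a toy sketch of the same idea in shell, not how any real init system does it. The script name app and its contents are invented for the example. Version 1 installs version 2 over its own file with strategy 3, then calls exec, which replaces the running process with the new code while keeping the same process ID:

```shell
dir=$(mktemp -d)                   # scratch directory (illustrative)
cat >"$dir/app" <<'EOF'
#!/bin/sh
echo "v1 running as pid $$"
# Install the new version under a temporary name, then rename it over this file.
printf '#!/bin/sh\necho "v2 running as pid $$"\n' >"$0.new"
chmod +x "$0.new"
mv "$0.new" "$0"
# Replace the running process with the new version (same pid).
exec "$0" "$@"
EOF
chmod +x "$dir/app"
out=$("$dir/app")
echo "$out"
rm -rf "$dir"
```

Both lines of output report the same pid. In a compiled a.out the equivalent step would be a call to execve on the program's own path after the new file has been moved into place; any state that must survive has to be passed along explicitly, for example through arguments, the environment or open file descriptors.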
