Will an `unlink` or `rename` portably and atomically make a `link` fail

concurrencyfilesportabilityrename

Question

Suppose I have some non-directory (file, named pipe/socket, whatever) at the pathname /tmp/foo and some other non-directory at the pathname /tmp/bar. Then two (or more) processes start executing concurrently:

Process one does:

unlink('/tmp/foo') /* or rename('/tmp/foo', '/tmp/removed') */
unlink('/tmp/bar') /* or rename('/tmp/bar', '/tmp/removed') */

Process two (and so on) does:

link('/tmp/foo', '/tmp/bar')

As I understand it, there is no way process two could possibly succeed (either the link(2) is attempted while /tmp/foo is still present, in which case /tmp/bar is also present so it must fail with EEXIST, or /tmp/foo is gone so is must fail with ENOENT).

But this intuition relies on the assumption that the unlink(2) and/or rename(2) system calls are inherently sequential in their unlinking effects, so I am looking for verification of my understanding: Is there any *nix-like system out there whose kernel allows the two unlink(2) and/or rename(2) calls to succeed, but simultaneously causes link(2) to succeed as well (whether due to re-order the unlinking of /tmp/foo and /tmp/bar and not abstracting/hiding that from the process calling link(2), or through through some other quirky race condition/bug)?

Current Understanding

I have read the manpages for unlink(2), rename(2), and link(2) for Linux and a few BSDs, and the POSIX specification for these functions. But I don't think they actually contain anything reassuring on this matter, upon careful consideration. At least with rename(2), we're promised that the destination is atomically replaced if it's already present (bugs in the OS itself aside), but nothing else.

I have seen claims that multiple simultaneous executions of rename(foo, qux) will atomically and portably have all but one rename fail with ENOENT – so that's promising! I am just uncertain if that can be extended to having a link(foo, bar) fail with ENOENT under the same circumstances as well.

Preferred Answers

I realize that this is one of those "can't prove a negative" situations – we can at best only note that there is no evidence that a *nix-like system which will allow process two's link(2) to succeed exists.

So what I'm looking for is answers covering as many *nix-like systems as possible (at least Linux, OS X, and the various BSDs, but ideally also the proprietary still-in-some-use systems like Solaris 10) – from people who have sufficient familiarity with these systems and this narrow set of problems (atomic/well-ordered file system operations) that they're confident (as much as one realistically can be) that they'd know of issues like the aforementioned Mac OS X rename(2)-not-actually-atomic bug if they existed on the platforms they're familiar with. That would give me enough confidence that this works the way I think it does in a portable-enough manner to rely on.

Final Note

This isn't an "X/Y problem" question – there's no underlying problem that can be answered by referring me to the various locking/IPC mechanisms or something else that works around the uncertainty about how these particular system calls interact: I specifically want to know if one can rely on the above system calls portably interacting as expected across *nix-like systems in practical use today.

Best Answer

Look at standards such as POSIX for portability guarantees. In practice, most POSIX-compliant systems have minor deviations from the specifications, but generally speaking you can rely on the guarantees given in the specification. Most modern unices comply with the specification even if they haven't been formally tested. They may need to be run in a POSIX mode, e.g. setting POSIXLY_CORRECT=1 with bash or making sure that /usr/xpg4/bin is ahead of /bin and /usr/bin in PATH on Solaris.

Single Unix v2 (an older extension of POSIX) has this to say about link:

The link() function will atomically create a new link for the existing file and the link count of the file is incremented by one.

About rename:

If the link named by the new argument exists, it is removed and old renamed to new. In this case, a link named new will remain visible to other processes throughout the renaming operation and will refer either to the file referred to by new or old before the operation began.

POSIX explicitly states that if the destination exists, its replacement must be atomic. It does not however state that the renaming itself must be atomic, i.e. that there is no point in time when both old and new refer to the file in question, or when neither does. In practice, those properties are true on unix systems, at least with local filesystems.

Furthermore the order of operations is guaranteed: in C, ; guarantees sequential execution; in sh, ;/newline guarantees sequential execution (as do && and so on); other programming languages offer similar guarantees. So in

unlink("/tmp/foo");
unlink("/tmp/bar");

it is guaranteed that there is no point in time when /tmp/foo exists but not /tmp/bar (assuming that /tmp/bar exists initially). Therefore a concurrent process executing link("/tmp/foo", "/tmp/bar") cannot succeed.

Note that atomicity does not guarantee resilience. Atomicity is about observable behavior on a live system. Resilience, in the context of filesystems, is about what happens in case of a system crash. Many filesystems sacrifice resilience for performance, so if the execution of unlink("foo"); unlink("bar"); is interrupted (with the current directory on on-disk storage), it is possible that bar will be deleted and foo will remain behind.

Some network filesystems give fewer guarantees when operations happen on different clients. Older implementations of NFS were notorious for this. I think modern implementations are better but I have no experience of modern NFS.