Is `ln` atomic and reliable on NFS? Could NFS replace GFS in this use case?

concurrency, gfs, ln, lock, nfs

I have a cluster of servers sharing a disk that holds a GFS (Global File System) volume, which all nodes access simultaneously.

Each node in the cluster runs the same program (the main core is a shell script).
The system processes files that appear in a couple of input directories, and it works like this:

  • the program loops through the input directories.
  • for each file found, it checks whether a "lock file" exists; if it does, it skips to the next file.
  • if no lock file is found, it creates one. If lock file creation fails (race lost), it skips to the next file.
  • if "we" own the lock, it processes the file and moves it out of the way when finished.
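The steps above can be sketched roughly as follows. This is only an illustration, not the actual script: the function names `try_lock`, `process_file`, and `process_inputs` and the three directory arguments are made up for the example, and lock cleanup is left out because it is not described here.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 only if this caller created the lock.
try_lock() {
    lock=$1; unid=$2
    date "+%s:${unid}" > "${lock}.${unid}" || return 1
    # ln refuses to overwrite an existing name, so at most one caller succeeds.
    if ln "${lock}.${unid}" "${lock}" 2>/dev/null; then rc=0; else rc=1; fi
    rm -f "${lock}.${unid}"
    return "$rc"
}

# Placeholder for the real per-file work (assumption: not shown in the post).
process_file() { :; }

# Hypothetical main loop: input dir, lock dir, output dir, session ID.
process_inputs() {
    indir=$1; lockdir=$2; donedir=$3; unid=$4
    for f in "$indir"/*; do
        [ -f "$f" ] || continue                # skip non-files and an empty glob
        name=$(basename "$f")
        lock="$lockdir/lock.$name"
        [ -e "$lock" ] && continue             # lock exists: someone else has it
        try_lock "$lock" "$unid" || continue   # race lost: skip to the next file
        process_file "$f"
        mv "$f" "$donedir/$name"               # move it out of the way
    done
    return 0
}
```

The existence check before `try_lock` is only an optimisation; the decision about who owns the file is made by `ln` alone.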

This all works very well, but I wonder if there are cheaper (less complex) solutions that would also work. I'm thinking NFS or SMB perhaps.

There are two reasons for my use of GFS:

  1. each file is stored in one place only (on redundant underlying hardware of course)
  2. file locking works reliably

I create the lockfile like this:

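# Write a timestamp and our session ID into a uniquely named temp file.
date "+%s:${unid}" > "${currlock}.${unid}"
# Try to hard-link it to the shared lock name; this fails if the lock already exists.
ln "${currlock}.${unid}" "${currlock}"
lockrc=$?    # 0 means we own the lock
rm -f "${currlock}.${unid}"

where $unid is a unique session identifier and $currlock is /gfs/tmp/lock.${file_to_process}.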

The beauty of ln is that it is atomic: when several nodes attempt the same link at the same time, it succeeds for exactly one of them and fails for all the others.
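A small way to see the "only one wins" behaviour is to let several background subshells race for the same lock name and count the winners. This is a sketch with made-up names (`race_for_lock`, `lock.demo`, `winners`); run on a local file system it only shows the exclusivity of `ln` there and proves nothing about NFS by itself.

```shell
#!/bin/sh
# Hypothetical demo: start N contenders that all try to link their own
# temp file to the same lock name, then print how many succeeded.
race_for_lock() {
    dir=$1; n=$2
    lock="$dir/lock.demo"
    : > "$dir/winners"
    i=1
    while [ "$i" -le "$n" ]; do
        (
            tmp="$lock.$i"
            echo "$i" > "$tmp"
            # ln fails if another contender has already created the lock name.
            if ln "$tmp" "$lock" 2>/dev/null; then
                echo "$i" >> "$dir/winners"
            fi
            rm -f "$tmp"
        ) &
        i=$((i + 1))
    done
    wait    # let every contender finish before counting
    wc -l < "$dir/winners" | tr -d ' '
}
```

Calling `race_for_lock "$(mktemp -d)" 8` should print the number of contenders that got the lock.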

So, I guess what I'm asking is: will NFS meet my needs? Does ln work reliably on NFS in the same way as on GFS?

Best Answer

The link() system call on the NFS client should map directly to the NFS LINK operation, which the server should implement using its link() system call. So as long as link() is atomic on the server, it will also be atomic on the clients.
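One caveat worth guarding against on NFS: a client can be told that the link failed even though the server actually created it, for example when a reply is lost and the request is retransmitted. The Linux open(2) manual page suggests checking the link count of the unique file in that case: if it has risen to 2, the lock was taken after all. A minimal sketch of that check, assuming GNU `stat` (the `-c %h` format prints the hard-link count) and using a made-up function name `acquire_lock_checked`:

```shell
#!/bin/sh
# Hypothetical helper: take "$lock" via a hard link from a unique temp file.
# Returns 0 if we own the lock, non-zero otherwise.
acquire_lock_checked() {
    lock=$1; unid=$2
    tmp="${lock}.${unid}"
    date "+%s:${unid}" > "$tmp" || return 1
    if ln "$tmp" "$lock" 2>/dev/null; then
        rc=0
    else
        # ln reported failure. If our temp file nevertheless has two links,
        # the link was created and we do own the lock.
        # Assumption: GNU stat; other platforms use a different syntax.
        links=$(stat -c %h "$tmp" 2>/dev/null)
        if [ "$links" = "2" ]; then rc=0; else rc=1; fi
    fi
    rm -f "$tmp"
    return "$rc"
}
```

The check relies on `$unid` being unique per session, so that nobody else ever links to the temp file.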
