Two entries with identical name and inode in same directory

afp filesystem finder

I have just seen the most remarkable thing in a deep corner of my large archive disk: a single directory containing two entries (subdirectories) with the same name and same inode number.

% ls -liaF path/to/directory/with/duplicate/entries
total 56
1227293 drwxr-xr-x  1 jdlh  staff    264  3 Jan  2016 ./
1227288 drwxr-xr-x  1 jdlh  staff    264  3 Jan  2016 ../
1227364 drwxr-xr-x  1 jdlh  staff    264 20 Feb  2009 .externalToolBuilders/
1227364 drwxr-xr-x  1 jdlh  staff    264 20 Feb  2009 .externalToolBuilders/
1227367 -rw-rw-rw-  1 jdlh  staff    859 20 Feb  2009 other_files

There are two entries named .externalToolBuilders/ in this directory. They appear to share the same inode number, so they refer to the same thing, not just to two things which share the same name.

When I try to copy this directory to another file system volume using Finder, I get an error that "a directory with the name .externalToolBuilders already exists". The copy aborts at that point. I can refer to both entries with a single wildcard like diff -rq .ext*, and the command treats this as expanding to two arguments.

However, when I run some command line commands, such as cp -r, on the duplicated subdirectory by name, only one of the entries gets copied.

It looks like this directory was expanded from a .zip archive. It is possible that the .zip archive was constructed incorrectly to have two entries with the same name, that the archive utility did not catch the mistake, but that Finder did.

[Update after more investigation: the problem is observed in directories which were not expanded from a .zip archive. The problem is observed in several locations, but all observed examples are on a volume hosted by a QNAP Network Attached Storage (NAS) appliance, served to Mac OS through netatalk AFP support.]

Has anyone seen this on a Mac OS file system before? I find it most curious. How can it be constructed at will? Are there known drawbacks of it? How can one clean it up?

Best Answer

I finally figured out the underlying cause. The following is adapted from my blog post with my findings.

I connected to the shell of the original server's underlying Linux operating system and looked at the corresponding directory in the server's underlying file system. This is what I saw (bowdlerised for confidentiality and simplified for clarity):

% ls -la /share/Volume/path/to/directory/with/duplicate/entries
total 56
drwxr-xr-x  1 myuser  everyone      4096 Jan  3  2016 ./
drwxr-xr-x  1 myuser  everyone      4096 Aug 24 00:21 ../
drwxr-xr-x  1 myuser  everyone      4096 Aug 24 01:32 .externalToolBuilders/
drwxr-xr-x  1 myuser  everyone      4096 Feb 20  2009 :2eexternalToolBuilders/
-rw-rw-rw-  1 myuser  everyone      4096 Feb 20  2009 other_files

The crucial observation is the directory name, :2eexternalToolBuilders. It begins with the string ":2e", while the other directory entry begins with the string ".". From the point of view of the underlying operating system and file system, there are no duplicate entries in this directory. The two "externalToolBuilders" directories have different names.

The layers of software on top of the server's operating system, quite probably the netatalk AFP software, interpret the prefix ":2e" as standing for ".". When presenting the underlying directory entry :2eexternalToolBuilders through AFP to my Mac, they rewrite the entry's name to .externalToolBuilders. They fail to notice that there is another entry named .externalToolBuilders in that directory. The result is that my Mac sees, on the original server, a directory with an unexpected, rule-violating duplication.

I suspect that the use of prefix ":2e" in place of prefix "." is a convention from old Server Message Block (SMB) file server software. SMB allowed Mac OS files to be stored on underlying Windows file systems. The Windows file system of the time did not permit filenames with a leading ".". "2e" can be read as a hex ASCII code for period ".". The colon ":" can be read as an escape character, meaning that it plus the following two hex digits should be presented as the character represented by the digits. Thus ":2e" in an underlying directory entry name stands for "." in the directory entry name presented by the server.
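The escape rule described above can be sketched in a few lines of Python. This is only an illustration of my reading of the convention (a colon followed by exactly two hex digits stands for one character). It is not taken from netatalk or any SMB server, and the function name decode_colon_hex is made up for this example.

```python
import re

# Assumption: ":" followed by exactly two hex digits encodes one character,
# as inferred from ":2e" standing for "." (0x2e is the ASCII full stop).
_ESCAPE = re.compile(r":([0-9A-Fa-f]{2})")


def decode_colon_hex(name: str) -> str:
    """Return the name as a client would see it, with :XX escapes expanded."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


print(decode_colon_hex(":2eexternalToolBuilders"))  # .externalToolBuilders
print(decode_colon_hex(".externalToolBuilders"))    # already plain, unchanged
```

Two different underlying names decode to the same presented name, which is exactly the collision I saw.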

It turns out that the data on the original server was old enough to have been copied forward through multiple versions of server and server software. Sometimes I accessed it through SMB, and other times through AFP. I expect that the directory :2eexternalToolBuilders was created first, and the companion directory .externalToolBuilders was created later. They coexisted, especially for old data which I didn't access. Only when I used Finder to copy the directory did the conflict become apparent.

I speculate that the inconsistent behaviour I saw on my Mac, looking at the volume presented through AFP, is caused by Mac OS utilities treating directory entries differently depending on whether they look up a specific name in a directory, or enumerate all names. The software no doubt assumed there can be no duplicate names among the entries. Utilities looking for a specific name will find one or other of the duplicates, and stop. There is no reason to look for another of that name, because none should exist. Utilities enumerating all names, or all names matching a wildcard, return all matching entries, not caring that some are duplicates. The duplicate inode number can be explained by the software enumerating all names, then for each name, using that name to look up the inode number corresponding to that name. The software returning inode numbers would of course end its search with the same directory entry both times, because it was looking for the same name both times.
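To make that speculation concrete, here is a toy model in Python. It is not how netatalk or Mac OS is actually implemented. The inode numbers are invented, the function names are hypothetical, and the choice of which duplicate a lookup finds first is arbitrary. It only shows how "enumerate all names, then look each name up" can yield the same inode twice.

```python
# Toy model of a server directory: underlying name -> inode (numbers invented).
UNDERLYING = {
    ".externalToolBuilders": 101,
    ":2eexternalToolBuilders": 102,
    "other_files": 103,
}


def presented(name):
    """Name as shown to the client: a leading ':2e' is rewritten to '.'."""
    return "." + name[3:] if name.startswith(":2e") else name


def enumerate_names():
    """Enumeration returns every entry, without checking for duplicates."""
    return [presented(n) for n in UNDERLYING]


def lookup_inode(wanted):
    """Lookup by name stops at the first match (which one is arbitrary here)."""
    for name, inode in UNDERLYING.items():
        if presented(name) == wanted:
            return inode
    raise FileNotFoundError(wanted)


# What a listing tool might do: enumerate, then look up each name's inode.
listing = [(lookup_inode(n), n) for n in enumerate_names()]
for inode, name in listing:
    print(inode, name)
```

In this model the listing shows .externalToolBuilders twice with inode 101 both times, while inode 102 is never reported, even though the underlying directory holds three distinct entries.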

The solution for me was to patrol the underlying filesystem of my original server, looking for cases of duplicates separated by "." and ":2e" prefixes. I found about five cases, with names like .externalToolBuilders, .svn, .libs, and .metadata. I used shell commands to merge all files into the entry with the "." prefix, then deleted the entry with the ":2e" prefix. This removed the duplication, and let the Finder copy succeed.
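The patrol step could be automated along the following lines. This is a sketch, not the commands I actually ran. It assumes Python is available wherever the underlying file system can be read, it only handles the leading ":2e" case, and the function name find_dot_collisions is hypothetical. It reports collisions and deliberately does not merge or delete anything.

```python
import os


def find_dot_collisions(root):
    """Yield (escaped_path, dotted_path) pairs that present the same name.

    A pair is reported when a directory holds both ':2eNAME' and '.NAME'.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        names = set(dirnames) | set(filenames)
        for name in sorted(names):
            if name.startswith(":2e"):
                dotted = "." + name[3:]
                if dotted in names:
                    yield (os.path.join(dirpath, name),
                           os.path.join(dirpath, dotted))


# Example use, run on the server itself against the share's real path:
#     for escaped, dotted in find_dot_collisions("/share/Volume"):
#         print(escaped, "collides with", dotted)
```

Each reported pair still needs to be compared and merged by hand, since the two directories may hold different files.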