Do ZIP files preserve all HFS+ features when created using Finder’s Compress command?

backup finder zip

Is it a good idea to back up a directory on an HFS+ filesystem by using the Finder's Compress function, and then copying the ZIP file to a FAT32 hard disc, or to Dropbox, etc.? Or might that cause data corruption or data loss?

For example, if I compress my iTunes library, and symlinks are replaced by another copy of the file, that's a change in the semantics. If my hard disc crashes, and I restore a copy of the iTunes library from backup, iTunes may not work properly because of this. For example, changing the contents of one file won't affect the other. Deleting the file being pointed to means that you can no longer read the contents of that file via the symlink, which again is different if the symlink is replaced by a copy of that file. iTunes may crash given a corrupt library, or further corrupt the library, which means that the backup failed to serve its purpose.

Is it guaranteed that all valid directories compress to ZIP without errors, and expand to an identical copy of the original directory, without any loss of information or semantic changes? More specifically, do ZIP files support all HFS+ features?

  1. Symlinks
  2. Hard links (including to directories, which are supported on HFS+)
  3. Aliases
  4. Extended attributes
  5. Resource forks
  6. ACLs
  7. Unix permissions
  8. All valid path names in HFS+. In other words, does ZIP support all characters that are valid to use in a path name? Does ZIP support the longest path name you can create in HFS+, or is there a lower path length limit in the ZIP format?
  9. Is there a 4GB file size limit?

… and so on.

I'm concerned about the potential for silent changes, which cause silent data loss or corruption without me being aware of it until it's too late.

This is a question about the ZIP format, and also about the Finder's Compress command. Because even if the ZIP format supports something, if the Finder's implementation doesn't, it doesn't help.

Best Answer

No credit for me. I have literally taken this from user NSGod: https://superuser.com/a/222590

My own addition, a bit obvious:

  • the target filesystem (where the archive is unzipped) must itself support symlinks, hard links, etc.; otherwise they cannot be restored
  • you can use the zip command on the command line and preserve symlinks explicitly with zip --symlinks -r foo.zip foo/
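
To check whether an archive actually recorded symlinks as symlinks, one option is to inspect each entry's stored Unix mode. The sketch below assumes the archive was written by a tool that stores the Unix mode in the upper 16 bits of the ZIP "external attributes" field and the link target as the entry's data, which is what Info-ZIP's zip does with --symlinks. The helper names symlink_entries and build_demo_archive are made up for illustration, and the demo archive is built in memory rather than by Finder.

```python
import io
import stat
import zipfile


def symlink_entries(archive):
    """Return {entry name: link target} for entries flagged as symlinks."""
    links = {}
    for info in archive.infolist():
        # Upper 16 bits hold the Unix mode when the writer recorded one.
        mode = info.external_attr >> 16
        if stat.S_ISLNK(mode):
            # For a symlink entry, the stored data is the link target path.
            links[info.filename] = archive.read(info).decode("utf-8")
    return links


def build_demo_archive():
    """Build a small in-memory archive with one file and one symlink entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("foo/target.txt", "hello")
        link = zipfile.ZipInfo("foo/link")
        link.create_system = 3  # 3 = Unix
        link.external_attr = (stat.S_IFLNK | 0o777) << 16
        zf.writestr(link, "target.txt")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    with zipfile.ZipFile(build_demo_archive()) as zf:
        print(symlink_entries(zf))
```

To inspect a real archive, open it with zipfile.ZipFile("foo.zip") and pass it to symlink_entries. An empty result for a directory known to contain symlinks suggests the links were not stored as links.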

How are you creating the .zip archive in OS X? (Using a command-line tool, and if so, which one, or by using Archive Utility, etc.)

What operating system is the target computer (where the archive will be unzipped) running, and what method are you using to unzip the file there?

First of all, in OS X, symbolic links are basically plain text files with some extra "Mac" information that lets OS X know that it should treat the file as a symbolic link. This extra Mac information is a special file type, creator code, and Finder flag information, which is stored not in the file itself, but in the HFS+ disk directory.

In OS X, when you create a .zip file, there is no room in the zip stream for this extra Mac information, so, in a sense, the symbolic link is stored within the zip file as a plain file. Whether someone on another Mac can unzip the archive and have it properly represent the original structure appears to depend on who or what you use to unzip it.

For example, a month or so ago a company released a game over Steam, Valve's game distribution software. The game application bundle included NVIDIA's Cg library in the form of a framework, which internally uses symbolic links. There was originally a problem with Steam not properly restoring the necessary Mac information on Trine.app, as mentioned in this thread: http://forums.steampowered.com/forums/showthread.php?t=1556083

The image below shows 2 different copies of the Cg.framework, one I had installed separately from NVIDIA's website (upper image), and the lower image shows what was received with the game:

[Image: the two copies of Cg.framework, NVIDIA's install above and the game's copy below]

Notice that all the items match up, but what should be symbolic links are plain data files.

After taking a closer look at the FSCatalogInfo record for both the items, it was clear to me what the issue was:

[Image: FSCatalogInfo records for the two items]

You'll notice in the upper image that the beginning of the finderInfo struct has the following values:

0x736C6E6B = 'slnk'
0x72686170 = 'rhap'

These values are defined in /usr/include/hfs/hfs_format.h:

/*
 *  File type and creator for symbolic links
*/
enum {
    kSymLinkFileType  = 0x736C6E6B, /* 'slnk' */
    kSymLinkCreator   = 0x72686170  /* 'rhap' */
};
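
The two constants are four-character codes: each byte of the 32-bit value, read from the most significant end, is one ASCII character. A minimal sketch of that decoding follows; the function name fourcc is an illustrative choice, not something taken from the header.

```python
import struct

K_SYMLINK_FILE_TYPE = 0x736C6E6B  # value quoted from hfs_format.h above
K_SYMLINK_CREATOR = 0x72686170    # value quoted from hfs_format.h above


def fourcc(value):
    """Decode a 32-bit four-character code into its 4-letter string."""
    # Pack big-endian so the most significant byte becomes the first letter.
    return struct.pack(">I", value).decode("ascii")


if __name__ == "__main__":
    print(fourcc(K_SYMLINK_FILE_TYPE), fourcc(K_SYMLINK_CREATOR))
```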

The 9th byte value, 0x80, corresponds to the kIsAlias flag of the finderInfo.finderFlags. That value is defined in /System/Library/Frameworks/CoreServices.framework/.../CarbonCore.framework/.../Headers/Finder.h:

enum {
    kIsAlias                      = 0x8000 /* Files only */
};
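
The flag check can be sketched as follows. This assumes the 16-byte Finder info begins with a 4-byte type, a 4-byte creator, and a 2-byte flags field, all big-endian, which matches the dump described above (9th byte 0x80, giving 0x8000). The sample buffer is made up to mirror those values and was not read from a real file.

```python
import struct

K_IS_ALIAS = 0x8000  # value quoted from Finder.h above


def finder_flags(finder_info):
    """Return the 16-bit finderFlags field from a raw Finder info buffer."""
    # Assumed layout: bytes 0-3 type, bytes 4-7 creator, bytes 8-9 flags.
    (flags,) = struct.unpack_from(">H", finder_info, 8)
    return flags


def has_alias_flag(finder_info):
    """True if the kIsAlias bit is set in the flags field."""
    return bool(finder_flags(finder_info) & K_IS_ALIAS)


# Hypothetical buffer: 'slnk', 'rhap', flags 0x8000, then six zero bytes.
sample = b"slnk" + b"rhap" + bytes([0x80, 0x00]) + bytes(6)

if __name__ == "__main__":
    print(hex(finder_flags(sample)), has_alias_flag(sample))
```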

It appears that the unzip feature built into OS X (Archive Utility) is hard-coded to look for files in the archive being unzipped that represent symbolic links, and to set the information appropriately. I believe that /usr/bin/ditto (when used for its ability to archive files) also takes care of this for you. I'm not sure whether zip or unzip do.
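
Since the result depends on which tool does the unzipping, it may be worth comparing the restored tree against the original before trusting a backup. The sketch below only checks symlinks: it reports paths that were symlinks in the original but are missing, are plain files, or point elsewhere in the restored copy. The function names are illustrative, and it does not look at hard links, extended attributes, resource forks, or ACLs.

```python
import os


def symlink_map(root):
    """Map each symlink's path (relative to root) to its link target."""
    links = {}
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in dirnames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                links[os.path.relpath(full, root)] = os.readlink(full)
    return links


def lost_symlinks(original, restored):
    """Return symlink paths from original that were not restored as identical links."""
    before = symlink_map(original)
    after = symlink_map(restored)
    return sorted(path for path, target in before.items()
                  if after.get(path) != target)


if __name__ == "__main__":
    import sys
    for path in lost_symlinks(sys.argv[1], sys.argv[2]):
        print("symlink not preserved:", path)
```

Run it with the original directory and the unzipped copy as the two arguments. No output means every symlink survived the round trip with the same target.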