Linux – git-like file system

filesystems git linux ruby unix

Git stores each file's content in its repository under a hash calculated from that content. If my directory has two copies of the same file somewhere inside it, Git will only actually store the content once.
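To make the idea concrete, here is a minimal sketch of content-addressed storage in Python. The `git_blob_id` function follows the way Git computes blob IDs in SHA-1 repositories (a `blob <size>` header, a NUL byte, then the content). The `ContentStore` class and the example paths are hypothetical names made up for illustration; real Git also compresses objects and writes them to disk, which this toy version skips.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the ID Git gives a blob with this content (SHA-1 repositories)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, not Git's code)."""

    def __init__(self):
        self.objects = {}  # object ID -> content, one entry per unique content
        self.paths = {}    # path -> object ID

    def add(self, path: str, content: bytes) -> str:
        oid = git_blob_id(content)
        # setdefault keeps the existing entry if this content is already stored
        self.objects.setdefault(oid, content)
        self.paths[path] = oid
        return oid


store = ContentStore()
first = store.add("app1/vendor/lib.bin", b"identical bytes")
second = store.add("app2/vendor/lib.bin", b"identical bytes")
print(first == second, len(store.paths), len(store.objects))
```

Both paths map to the same object ID, so the store tracks two paths but holds the content only once.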

Has this same concept been implemented at the operating-system level, as some kind of file system?

If a file system acted this way by default, it would help nicely with DLL hell issues. Essentially, it would symlink automatically on your behalf. Any application could be packaged (like a JAR) in a directory with all of its dependencies at no extra storage cost.

Ruby enthusiasts share libraries by publishing them as RubyGems. Still, sharing gems this way resulted in deployment nightmares, which led to the "Vendor Everything" approach of copying all dependencies into local folders to avoid them.

Best Answer

What you're looking for is called "deduplication". While it's usually implemented by vendors of specialized storage products, the ZFS filesystem implements it as well. Most Unix-derived operating systems can make use of ZFS, and I'd therefore recommend it as the first place to look.
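Before turning deduplication on, it can help to estimate how much duplicate content a tree actually holds. The following is a rough sketch, not anything ZFS itself provides: the `file_digest` and `duplicate_report` names are made up for illustration, and it compares whole files, whereas ZFS deduplicates at the block level, so the real savings may differ.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicate_report(root):
    """Group regular files under root by content hash.

    Returns (duplicates, reclaimable), where duplicates maps a digest to the
    sorted paths sharing that content, and reclaimable is the number of bytes
    taken up by the redundant copies. Symlinks are skipped. Hard links are not
    detected, so files that already share storage are still counted.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                groups[file_digest(path)].append(path)
    duplicates = {d: sorted(p) for d, p in groups.items() if len(p) > 1}
    reclaimable = sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )
    return duplicates, reclaimable


if __name__ == "__main__":
    dupes, saved = duplicate_report(".")
    for digest, paths in dupes.items():
        print(digest[:12], *paths)
    print("bytes taken by redundant copies:", saved)
```

Run it from the directory that holds your vendored dependencies. If the reported total is small, deduplication is unlikely to be worth the extra bookkeeping.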
