Hardlinks seem to take several hundred bytes just for the link itself (not file data)

disk-usage hard-link

Additional Info

Firstly, thank you for all the answers.

So I re-ran the tests to check the answer below, which says that each directory/folder takes up 4KB and that this was skewing my numbers. This time I placed 20,000 files in a single directory and did the cp -al to another directory. The results were very different: after subtracting the length of the filenames, the hardlinks worked out to about 13 bytes each, much better than 600. For completeness, I then tested the claim that each directory/folder takes up 4KB: I created thousands of directories and placed one file in each. After the maths (increase in space used on the hd divided by the number of files, ignoring directories), the result was almost exactly 4KB per file. This shows that a hardlink takes up only a few bytes, but each actual directory/folder takes 4KB.

So I was thinking of implementing the rsync/hardlink snapshot backup strategy and was wondering how much space a hardlink takes up, since it has to add a directory entry for the extra link, and so on. I couldn't find any information on this, and I guess it is filesystem dependent. What I could find ranged from suggestions that hardlinks take no space (probably meaning no space for file contents), to the space being negligible, to a hardlink taking only a few bytes.

So I took a couple of systems (one a vm and one on real hardware) and did the following in the root directory as root:

mkdir link
cp -al usr link

The usr directory had about 54,000 files. The space used on the hd increased by about 34MB. So this works out to around 600 bytes per hardlink. Or am I doing something wrong?

I am using LVM on both systems, formatted as ext4.

The file name size is about 1.5MB altogether (I got that by doing ls -R and redirecting it to a file).
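For reference, the arithmetic behind the "around 600 bytes" figure can be written out as below. This is only a sketch in Python (not something from the original tests), and it assumes decimal megabytes; with binary megabytes the first figure comes out nearer 660.

```python
# Figures quoted in the question; decimal MB is an assumption.
MB = 1_000_000
space_increase = 34 * MB      # growth reported by df after cp -al
name_bytes = int(1.5 * MB)    # total length of the file names (from ls -R)
files = 54_000                # approximate number of files under usr

# Average cost per hard link, with and without the file names themselves.
per_link = space_increase / files
per_link_without_names = (space_increase - name_bytes) / files

print(round(per_link), round(per_link_without_names))
```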

To be honest, the rsync with hardlinks works so well that I was planning on using it for daily backups on a couple of the work servers. I also thought it would be easy to keep incremental backups/snapshots like this for a considerable period of time. However, after ten days 30MB becomes 300MB, and so on. In addition, if there have only been a few changes to the actual file contents, say a few hundred KB, then storing 30+ MB of hardlinks per day seemed excessive, though I take your point about the size of modern disks. It was simply that I had not seen this hardlink size mentioned anywhere, so I thought I might be doing something wrong. Is 600 bytes normal for a hardlink on a Linux OS?

To calculate the space used I did a df before and after the cp -al.
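A rough equivalent of the df-before-and-after measurement, sketched in Python with the standard shutil.disk_usage call. The helper name used_bytes and the choice of "/" as the path are illustrative assumptions. Like df, this reports usage for the whole filesystem, so anything else writing to the disk during the test will show up in the difference.

```python
import shutil

def used_bytes(path="/"):
    """Bytes currently in use on the filesystem that contains path."""
    return shutil.disk_usage(path).used

before = used_bytes("/")
# ... run the operation being measured here, e.g. cp -al usr link ...
after = used_bytes("/")

# Difference in used space; may include unrelated activity on the same filesystem.
delta = after - before
print(delta)
```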

Best Answer

cp -al usr link creates a bunch of hard links, but it also creates some directories. Directories can't be hard linked¹, so they're copied.
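To make the split between linked files and copied directories concrete, here is a minimal Python sketch of roughly what cp -al does. The function link_tree and the test layout are made up for illustration; real cp -al also preserves ownership, modes, timestamps and symlinks, which this ignores.

```python
import os
import tempfile

def link_tree(src, dst):
    """Recreate src's directories under dst and hard-link every file into them.

    Returns (directories_created, links_created).
    """
    dirs_created = links_created = 0
    for root, _subdirs, files in os.walk(src):
        target = os.path.normpath(os.path.join(dst, os.path.relpath(root, src)))
        os.makedirs(target, exist_ok=True)   # a directory is a new object each time
        dirs_created += 1
        for name in files:
            # A file only gains one more directory entry for its existing inode.
            os.link(os.path.join(root, name), os.path.join(target, name))
            links_created += 1
    return dirs_created, links_created

# Small test layout: 20 directories with one file each.
with tempfile.TemporaryDirectory() as top:
    src = os.path.join(top, "src")
    for i in range(20):
        d = os.path.join(src, f"dir{i}")
        os.makedirs(d)
        with open(os.path.join(d, "file.txt"), "w") as f:
            f.write("data")
    dirs_created, links_created = link_tree(src, os.path.join(top, "copy"))

print(dirs_created, links_created)
```

With one file per directory, the copy creates a new directory for every link it makes (plus one for the top level), which is the worst case for space.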

Each hard link occupies the space of a directory entry, which needs to store at least the file's name and the inode number. Each directory occupies the space of a directory entry, plus an inode for its metadata. Most filesystems, including the ext2 family, count inode space separately. All the hard links are in directories created by the copy operation. So the space you're seeing is in fact the size of the directories under /usr.
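A short Python sketch of both points: a hard link is just a second name for the same inode, and the entry for that name is small. The size formula is an assumption based on the classic linear ext2/3/4 directory entry layout (an 8-byte header followed by the name, padded to a multiple of 4 bytes); hashed directory indexes and other filesystems add their own overhead, so treat it as a lower bound rather than an exact figure.

```python
import os
import tempfile

def ext4_dirent_size(name):
    """Estimated bytes for one entry in a linear ext4 directory block.

    Assumed layout: inode (4) + rec_len (2) + name_len (1) + file_type (1),
    then the name, rounded up to a multiple of 4.
    """
    name_len = len(os.fsencode(name))
    return (8 + name_len + 3) & ~3

with tempfile.TemporaryDirectory() as top:
    original = os.path.join(top, "data.txt")
    link = os.path.join(top, "data-link.txt")
    with open(original, "w") as f:
        f.write("x" * 1000)
    os.link(original, link)            # second directory entry, no new file data

    a, b = os.stat(original), os.stat(link)
    same_inode = a.st_ino == b.st_ino  # both names resolve to one inode
    link_count = a.st_nlink            # number of names pointing at that inode

print(same_inode, link_count, ext4_dirent_size("data-link.txt"))
```

Under that assumed layout, the cost beyond the name itself is 8 bytes plus up to 3 bytes of padding per entry.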

In most filesystems, each directory occupies at least one block. 4kB is a typical block size on Linux. So you can expect the copy to take 4×(number of directories) in kB, plus some change for the larger directories that require multiple blocks. Assuming 4kB blocks, your copy created about 8500 blocks, which sounds about the right ballpark for a /usr directory containing 54000 files.
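The estimate above can be sketched in Python as follows. The 4096-byte block size is the assumption stated in the answer, and the helper names are invented for this example. The directory-based figure is a lower bound, since large directories need more than one block.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed 4 kB blocks

def blocks_for(space_bytes, block_size=BLOCK_SIZE):
    """Whole blocks covered by a given number of bytes."""
    return space_bytes // block_size

def min_directory_overhead(root, block_size=BLOCK_SIZE):
    """Lower bound on directory space under root: one block per directory."""
    n_dirs = sum(1 for _ in os.walk(root))   # counts root itself as well
    return n_dirs * block_size

# 34 MB of growth expressed in blocks, for binary and decimal megabytes.
blocks_binary = blocks_for(34 * 1024 * 1024)
blocks_decimal = blocks_for(34 * 1000 * 1000)

# Tiny example tree: a root directory with three subdirectories.
with tempfile.TemporaryDirectory() as top:
    src = os.path.join(top, "src")
    for i in range(3):
        os.makedirs(os.path.join(src, f"sub{i}"))
    overhead = min_directory_overhead(src)

print(blocks_binary, blocks_decimal, overhead)
```

Running min_directory_overhead on the real source tree and comparing the result with the growth reported by df is a quick way to check whether directories account for the space.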

¹ Directories must have exactly one parent directory. They in fact do have hard links (or at least appear so, though modern filesystems tend not to use hard links under the hood): one for their entry in their parent, one for their . entry, and one for the .. entry in every subdirectory. But you can't make other hard links to them. Some Unix variants allow root to make hard links to directories on some filesystems but at the risk of creating loops that can't be removed or hidden directory trees that can't be accessed.
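The link counting described in this footnote can be observed with a short Python sketch. The names here are illustrative, and the reported count depends on the filesystem: ext4 and tmpfs normally show 2 plus the number of subdirectories, while some filesystems (btrfs, for example) always report 1 for directories. The last part assumes Linux, where link() on a directory is refused.

```python
import os
import tempfile

def expected_dir_links(n_subdirs):
    """Traditional link count: entry in parent + own '.' + one '..' per subdirectory."""
    return 2 + n_subdirs

with tempfile.TemporaryDirectory() as top:
    parent = os.path.join(top, "parent")
    os.mkdir(parent)
    for i in range(3):
        os.mkdir(os.path.join(parent, f"sub{i}"))

    observed = os.stat(parent).st_nlink   # filesystem dependent
    expected = expected_dir_links(3)

    # Trying to add another hard link to a directory fails.
    try:
        os.link(parent, os.path.join(top, "parent-link"))
        link_refused = False
    except OSError:
        link_refused = True

print(observed, expected, link_refused)
```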
