Linux – How does du determine which hard link to disregard

Tags: du, filesystems, linux

We have two directories:

$ ls -l
total 8
drwxr-x--- 2 nimmy nimmy 4096 Nov 15 19:42 jeter
drwxr-x--- 2 nimmy nimmy 4096 Nov 15 19:42 mariano

I create one file in the first folder:

$ dd if=/dev/zero of=jeter/zero_file.1 bs=512000 count=1
1+0 records in
1+0 records out
512000 bytes (512 kB) copied, 0.268523 s, 1.9 MB/s

This is the output of du:

$ du -sh *
504K    jeter
4.0K    mariano

As expected, if I place a hard link to zero_file.1 in the other folder, the du output does not change:

$ ln jeter/zero_file.1 mariano/zero_file.2
$ du -sh *
504K    jeter
4.0K    mariano

However, as far as I'm aware, there is nothing in the filesystem that points to zero_file.1 as the original file. So how does du know to count zero_file.1 but not zero_file.2?

It cannot be a timestamp comparison, because all hard links share one inode and therefore have the same timestamp data, correct?

Best Answer

Extending your test to three folders, you can see that du counts an inode only the first time it encounters it. If you go into an individual folder and run du there, you'll get the full size.

To test:

mkdir alexandru
ln mariano/zero_file.2 alexandru/zero_file.0
du -sh *

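You should now see alexandru taking up the 500K+, while jeter and mariano each show only 4.0K. The shell expands * in alphabetical order, so du visits alexandru first and counts the file there. So without looking at the du code, I'm guessing it stores a list of traversed inodes and doesn't count the ones it has already seen.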
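
The guess above can be sketched in a few lines of shell. This is only an illustration of the "remember which inodes were seen" idea, not du's actual code: first_links is a hypothetical helper name, and it assumes GNU stat (for the -c format option). It prints only the paths that would be counted, that is, the first path seen for each device:inode pair.

```shell
# Hypothetical helper (not part of du): print only the paths whose
# device:inode pair has not appeared earlier in the argument list.
# Assumes GNU stat; uses plain global variables to stay POSIX sh friendly.
first_links() {
    seen=" "                          # space-delimited set of "device:inode" keys
    for f in "$@"; do
        # %d is the device number, %i is the inode number
        id=$(stat -c '%d:%i' -- "$f") || continue
        case "$seen" in
            *" $id "*) continue ;;    # inode already seen: skip this link
        esac
        seen="$seen$id "              # remember the inode
        printf '%s\n' "$f"            # this path is the one that gets counted
    done
}
```

With the files from the question, first_links jeter/zero_file.1 mariano/zero_file.2 prints only jeter/zero_file.1, and swapping the two arguments prints only mariano/zero_file.2. Neither link is "the original"; whichever one is visited first is counted.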
