ext[234] by default reserves 8192 inodes per 128 MB block group, which takes 2 MB per group and works out to close to 1 GB for a 60 GB filesystem. There should be no difference when you mount the drive in another system. It looks like the way the kernel reports used space may have changed between wheezy and squeeze, though I have not yet found a commit indicating this was done on purpose.
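The arithmetic behind that figure can be checked with a short sketch. The 256-byte inode size is an assumption (it is what makes 8192 inodes come to 2 MB per group), and the function name inode_table_overhead_mb is hypothetical.

```shell
# Estimate the space reserved for inode tables on an ext2/3/4 filesystem.
# Assumptions: 128 MB block groups, 8192 inodes per group (from the text),
# and 256-byte inodes (assumed here; it gives the 2 MB per group quoted above).
inode_table_overhead_mb() {
    fs_size_mb=$1
    group_size_mb=128
    inodes_per_group=8192
    inode_size_bytes=256

    groups=$(( fs_size_mb / group_size_mb ))
    per_group_bytes=$(( inodes_per_group * inode_size_bytes ))
    echo $(( groups * per_group_bytes / 1024 / 1024 ))
}

# A 60 GB filesystem has 480 block groups of 2 MB of inodes each.
inode_table_overhead_mb $(( 60 * 1024 ))   # prints 960
```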
cp -al usr link
creates a bunch of hard links, but it also creates some directories. Directories can't be hard linked¹, so they're copied.
Each hard link occupies the space of a directory entry, which needs to store at least the file's name and the inode number. Each directory occupies the space of a directory entry, plus an inode for its metadata. Most filesystems, including the ext2 family, count inode space separately. All the hard links are in directories created by the copy operation. So the space you're seeing is in fact the size of the directories under /usr.
In most filesystems, each directory occupies at least one block. 4 kB is a typical block size on Linux. So you can expect the copy to take 4×(number of directories) in kB, plus some change for the larger directories that require multiple blocks. Assuming 4 kB blocks, your copy created about 8500 blocks, which sounds about the right ballpark for a /usr directory containing 54000 files.
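As a rough check of that estimate, the directories in a tree can be counted and multiplied by the block size. This is only a lower bound, assuming one block per directory and a 4 kB block size unless told otherwise; the function name estimate_link_copy_kb is hypothetical.

```shell
# Lower-bound estimate, in kB, of the space a "cp -al" copy of a tree takes:
# one block per directory, ignoring large directories that need more blocks.
# Caveat: directory names containing newlines would inflate the count.
estimate_link_copy_kb() {
    tree=$1
    block_kb=${2:-4}    # block size in kB; 4 is assumed as the typical value
    dirs=$(find "$tree" -type d | wc -l)
    echo $(( dirs * block_kb ))
}

# Usage: estimate_link_copy_kb usr
```

The result can be compared with what du reports for the copy to see how much of the difference comes from multi-block directories.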
¹ Directories must have exactly one parent directory. They in fact do have hard links (or at least appear so, though modern filesystems tend not to use hard links under the hood): one for their entry in their parent, one for their "." entry, and one for the ".." entry in every subdirectory. But you can't make other hard links to them. Some Unix variants allow root to make hard links to directories on some filesystems, but at the risk of creating loops that can't be removed or hidden directory trees that can't be accessed.
Best Answer
Assuming there aren't internal hardlinks (that is, every file with more than 1 hardlink is linked from outside the tree), you can do:
EDIT: And here is what I sketched in the comment, applied. Only without du; kudos to @StephaneChazelas for noticing du is not necessary. Explanation at the end.
What we do is create a string with the disk usage (in KB) of every relevant file, separated by plus signs. Then we feed that big addition to bc.
The first find invocation does that for directories.
The second find prints link count, inode, and disk usage. We pass that list through sort | uniq -c to get a list of (number of appearances in the tree, link count, inode, disk usage).
We pass that list through awk and, if the first field (# of appearances) is greater than or equal to the second (# of hard links), meaning there aren't links to this file from outside the tree, print the fourth field (disk usage) with a plus sign and a backslash attached.
Finally we output a 0, so the formula is syntactically correct (it would end in + otherwise), and pass it to bc. Phew.
(But I would use the simpler first method, if it gives a good enough answer.)
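A minimal sketch of the selection logic described above, assuming GNU find (for -printf with %n, %i and %k). It is not the original command: the final addition is done in awk rather than by building an expression for bc, and the function name du_internal_links is hypothetical.

```shell
# Sum the disk usage (in KB) of a tree, skipping files that also have hard
# links outside it. Assumes GNU find and a tree on a single filesystem
# (inode numbers are only unique per filesystem).
du_internal_links() {
    tree=$1
    {
        # Directories are always counted. Emit them in the same four-column
        # shape that uniq -c produces: appearances, link count, inode, KB.
        find "$tree" -type d -printf '1 1 %i %k\n'
        # Non-directories: one line per name; identical lines are the same
        # inode, so uniq -c prefixes the number of appearances in the tree.
        find "$tree" ! -type d -printf '%n %i %k\n' | sort | uniq -c
    } |
        # Count a file only if it appears in the tree at least as many times
        # as its link count, i.e. no link to it exists outside the tree.
        awk '$1 >= $2 { total += $4 } END { print total + 0 }'
}

# Usage: du_internal_links /path/to/tree
```

Summing in awk avoids the trailing 0 and the line-continuation backslashes that the bc formula needs, at the cost of departing from the pipeline as described.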