I have copied a folder using rsync, preserving symlinks, hard links and permissions, deleting extraneous files on the destination, et cetera. The two copies should be practically identical.
One folder is on a USB drive and the other on a local disk.
If I run du -bls on both folders, the size comes up as slightly different.
My du supports --apparent-size, which is implied by -b, and -l should count the content of hard links every time they appear.
How can this difference be explained and how do I get the actual total?
Both file systems are ext4; the only difference is that the USB drive is encrypted.
EDIT:
I dug down to find the folders that actually differ. I found one, and its content is not special (no block devices, no pipes, no hard links or symlinks, no zero-byte files); its only peculiarity may be that it contains several small files. For this particular folder the sizes are 872830 vs 881022 bytes.
I also ran du -blsc in both folders and the result is the same in this case.
Some extra details on the commands I used:
$ du -Pbsl $LOCALDIR $USBDIR | cut -f1
872830
881022
$ du -Pbslc $LOCALDIR/*
[...]
868734 total
$ du -Pbslc $USBDIR/*
[...]
868734 total
$ ls -la $USBDIR | wc
158 1415 9123
$ ls -la $LOCALDIR | wc
158 1415 9123
$ diff -sqr --no-dereference $LOCALDIR $USBDIR | grep -v identical
[No output and all identical if I remove the grep]
Best Answer
Since you have copied the files using rsync and then compared the two sets of files using diff, and since diff reports no difference, the two sets of files are identical.

The size difference can then probably be explained by the sizes of the actual directory nodes within the two directory structures. On some filesystems, the directory is not truncated if a file or subdirectory is deleted, leaving a directory node that is slightly larger than what's actually needed.
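As a rough check against the numbers in the question: du -s on a directory includes the directory node itself, while the total over $DIR/* does not include that top-level node. Assuming the folder has no hidden entries that the * glob would skip, subtracting one from the other gives the size of each top-level directory node. The variable names below are only for illustration.

```shell
# Figures copied from the du output in the question.
local_total=872830     # du -Pbsl $LOCALDIR
usb_total=881022       # du -Pbsl $USBDIR
contents_total=868734  # "total" line of du -Pbslc $DIR/*, same on both sides

# What is left after removing the contents is the directory node itself
# (assumption: no dotfiles were skipped by the * glob).
local_node=$((local_total - contents_total))
usb_node=$((usb_total - contents_total))

echo "local directory node: $local_node bytes"                  # 4096
echo "USB directory node:   $usb_node bytes"                    # 12288
echo "difference:           $((usb_node - local_node)) bytes"   # 8192
```

If the filesystems use the common 4096-byte ext4 block size, that would be one block for the directory on the local disk and three on the USB drive, which fits a directory that once held more entries than it does now.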
If you have, at some point, kept many files that were later deleted, this might have left large directory nodes.
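To get a total that ignores directory nodes, and to see which directory nodes are large, something like the following might work. It is only a sketch: it assumes GNU find (for -printf), and the function names content_bytes and dir_node_sizes are made up for this example.

```shell
# Sum the apparent sizes of everything except directories.
# Hard-linked files are counted once per name, similar to du -l;
# symlinks contribute the length of their target path, as with du -P.
content_bytes() {
    find "$1" ! -type d -printf '%s\n' | awk '{ total += $1 } END { print total + 0 }'
}

# List each directory node with its own size in bytes, largest last.
dir_node_sizes() {
    find "$1" -type d -printf '%s\t%p\n' | sort -n
}
```

Running content_bytes "$LOCALDIR" and content_bytes "$USBDIR" should then print the same number for two identical copies, while dir_node_sizes shows where the remaining difference comes from.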
Example:
Notice how, even though I deleted the 1000 files I created, the dir directory still uses 20 KB.
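A minimal sketch for reproducing this kind of experiment, assuming GNU coreutils (stat -c, seq, mktemp). The directory name dir matches the one above; the scratch location and variable names are made up here, and the sizes you see depend on the filesystem, so no expected numbers are shown.

```shell
# Work in a scratch directory so nothing else is touched.
scratch=$(mktemp -d)
mkdir "$scratch/dir"

# Size of the directory node itself while it is still empty.
before=$(stat -c %s "$scratch/dir")

# Create 1000 empty files, then look at the directory node again.
for i in $(seq 1 1000); do
    : > "$scratch/dir/file-$i"
done
during=$(stat -c %s "$scratch/dir")

# Delete all of them and check whether the node shrank back.
rm "$scratch"/dir/file-*
after=$(stat -c %s "$scratch/dir")

echo "empty: $before  with 1000 files: $during  after deleting: $after"
```

On a filesystem that does not truncate directories, the last number stays at the larger value even though the directory is empty again.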