Ubuntu – Why do du -sh and the file manager disagree

disk-usage filemanager

I would like to know why the directory sizes I get when I execute du -sh differ from the ones the file manager shows. What do they do differently, and how big is my data really? I am not so much interested in the space it takes up on the disk (because of blocks and so on); I just want to know how big the actual data is.

Best Answer

Short answer: The file manager calculates with units based on 1000, while du by default calculates with units based on 1024. Because of this, the file manager shows a file of 1024 bytes as "1.024 kB", while du treats it as "1.000 KiB". The gap (literally) multiplies with larger units, for example megabyte (1000 * 1000) vs. mebibyte (1024 * 1024) or gigabyte (1000 * 1000 * 1000) vs. gibibyte (1024 * 1024 * 1024).
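To make the short answer concrete, here is a minimal sketch of the two conversions. The function names are made up for illustration; neither the file manager nor du exposes anything like them.

```python
def in_kilobytes(n_bytes: int) -> float:
    """Decimal (SI) unit: 1 kB = 1000 bytes."""
    return n_bytes / 1000


def in_kibibytes(n_bytes: int) -> float:
    """Binary unit: 1 KiB = 1024 bytes."""
    return n_bytes / 1024


size = 1024  # the same file, measured in bytes
print(f"{in_kilobytes(size):.3f} kB")   # 1.024 kB
print(f"{in_kibibytes(size):.3f} KiB")  # 1.000 KiB
```

The number of bytes is identical in both cases; only the divisor used for display changes.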

Long answer: The difference stems from the different ways computers and humans count. Most current human societies do their math in the decimal system, that is, in base 10. Not all cultures in history did, which is why, for example, we divide the day into 24 hours. But for most things we use 10, or multiples of 10, or 10 to the n-th power. This is evident in the International System of Units (SI), whose most common prefixes step in factors of 10 ^ 3 = 1000: 1000 grams equal 1 *kilo*gram, the 1000th part of 1 meter is 1 *milli*meter, and so forth. Technically, 1000 kilograms would be 1 "megagram", but traditionally we use a different word for it, "tonne" (metric ton). Still, it is based on 1000.

On the other hand, computers calculate not in base 10 but in base 2: on/off, power/no power, true/false. Therefore, computers use multiples and powers of 2 instead of multiples of 10: 2, 4, 8, 16, 32, 64 and so forth. The power of 2 nearest to 1000 is 2 ^ 10 = 1024. Because of this, "1 kilobyte" was originally defined not as "1000 bytes", as most other units would have been, but as "1024 bytes". In the same way, "1 megabyte" originally was "1024 * 1024 bytes", "1 gigabyte" originally was "1024 * 1024 * 1024 bytes" and so forth.
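The claim that 1024 is the power of 2 closest to 1000 is easy to check. A small sketch (the variable names are just for illustration):

```python
# Powers of two from 2^1 up to 2^12 (2, 4, 8, ..., 4096).
powers_of_two = [2 ** n for n in range(1, 13)]

# Pick the one with the smallest distance to 1000.
nearest = min(powers_of_two, key=lambda p: abs(p - 1000))
exponent = nearest.bit_length() - 1

print(nearest, exponent)  # 1024 10
```

512 is 488 away from 1000, while 1024 is only 24 away, which is why 2 ^ 10 became the binary stand-in for "kilo".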

"Back then", most people who used computers knew about this, and at the scales that were common in those days it didn't make much of a difference. Whether a file is 1000 bytes "large" or 1024 bytes doesn't really matter in most cases. But time went on, computers became omnipresent, and the numbers became larger. Today, many computer users don't know about 1000 vs. 1024, or they don't care. It doesn't make much sense to explain to "Joe Everybody" that with almost everything "kilo" means "1000 of it", but with computers it's different. Additionally, the difference starts to get significant. If you compare a gigabyte based on 1000 to a "gigabyte" based on 1024, the difference is roughly 7%. With terabytes it is roughly 10%, and it keeps growing with every larger prefix.

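How quickly the gap grows can be computed directly. This is only a sketch of the arithmetic; the helper name is hypothetical.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta"]


def binary_excess_percent(power: int) -> float:
    """How much larger 1024^power is than 1000^power, in percent."""
    return (1024 ** power / 1000 ** power - 1) * 100


for power, prefix in enumerate(PREFIXES, start=1):
    print(f"{prefix}: {binary_excess_percent(power):.1f}%")
# kilo: 2.4%
# mega: 4.9%
# giga: 7.4%
# tera: 10.0%
# peta: 12.6%
```

Each step up multiplies the ratio by another 1.024, so the relative difference compounds rather than staying at 2.4%.
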
Therefore, standards bodies decided to differentiate between the two systems; the International Electrotechnical Commission (IEC) introduced dedicated binary prefixes in 1998. The classical prefixes kilo-, mega-, giga-, tera- etc. are now officially based on 1000. So, a file with 1024 bytes would no longer be "1.000 kilobyte", but "1.024 kilobyte". The units based on 1024 got new prefixes, made of the first two letters of the "old one" followed by "bi": kilo -> kibi, mega -> mebi, giga -> gibi, tera -> tebi and so forth. The symbols are KiB, MiB, GiB, TiB and so forth.
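The naming rule is regular enough to write down as code: take the first two letters of the decimal prefix and append "bi"; for the symbol, take the capitalized first letter and append "iB". A small sketch with hypothetical helper names:

```python
SI_PREFIXES = ["kilo", "mega", "giga", "tera", "peta"]


def binary_prefix(si_prefix: str) -> str:
    """kilo -> kibi, mega -> mebi, ..."""
    return si_prefix[:2] + "bi"


def binary_symbol(si_prefix: str) -> str:
    """kilo -> KiB, mega -> MiB, ... (note the capital K)."""
    return si_prefix[0].upper() + "iB"


for prefix in SI_PREFIXES:
    print(f"{prefix}byte -> {binary_prefix(prefix)}byte ({binary_symbol(prefix)})")
# kilobyte -> kibibyte (KiB)
# megabyte -> mebibyte (MiB)
# gigabyte -> gibibyte (GiB)
# terabyte -> tebibyte (TiB)
# petabyte -> pebibyte (PiB)
```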

Nautilus, Ubuntu's file manager, calculates based on 1000. So it shows your file sizes in kilobytes, megabytes etc. du on the other hand still calculates based on 1024. So with du you see your file sizes in kibibytes, mebibytes etc. And as said above, once we are in the tera- vs. tebi- range and up, it starts to show ;)

du offers the --si switch. It works like -h, but uses powers of 1000 instead of powers of 1024. So

du --si -s my_files/

would give you a size in kB, MB, GB etc., while

du -sh my_files/

would give you a size in KiB, MiB, GiB etc.
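To see how the same byte count ends up as two different numbers, here is a rough sketch of a human-readable formatter with a switchable base. It is not du's actual implementation: the function name is hypothetical, and it rounds to the nearest tenth, so real du output may round differently.

```python
def human_readable(n_bytes: int, base: int = 1024) -> str:
    """Format a byte count using powers of 1000 (like --si) or 1024 (like -h)."""
    if base == 1000:
        units = ["B", "kB", "MB", "GB", "TB"]
    else:
        units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < base or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= base


size = 5_000_000_000  # example: five billion bytes
print(human_readable(size, base=1000))  # 5.0 GB
print(human_readable(size, base=1024))  # 4.7 GiB
```

Both lines describe exactly the same amount of data, which is the whole discrepancy between the file manager and du in a nutshell.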