I read somewhere that Ubuntu will no longer use the familiar file size units we all know by now (kB, MB, GB, TB) and will switch to a different IEC standard (KiB, MiB, GiB, TiB). If this is true, I would like to know the reasoning behind this change and the impact (if any) it has, especially on multiplatform applications or applications run with Wine.
Ubuntu – Will Ubuntu no longer measure file size unit as byte, megabyte, gigabyte, etc
filesystem
Related Solutions
Introduction:
Data in electronic computers is stored and transmitted in various ways, but it is always interpreted as a sequence of binary values, either 0 or 1. One binary value is called a bit. Eight bits are called an octet, or a byte. On this there is consensus.
A bit is denoted as b, and a byte as B. On this there is consensus, and if you ever spot an application breaking this convention, it's definitely a bug or an error. People frequently confuse the two, but application developers and manufacturers on the whole do not.
Once you get to larger units, there are two schools of thought, which sadly means that there is no consensus. Different operating systems and different applications belong to one school of thought or another.
Ubuntu's unit policy:
Ubuntu has a published units policy, which defines units like this.
The first set of units are multiples of 1024. (Why 1024? Because 1024 is 2 to the power of 10, which can make life easier for programmers.) This set of units is called binary units or the IEC prefixes, after the IEC standard that defined them:
- One kibibyte: 1KiB = 1024 bytes (note the capital K)
- One mebibyte: 1MiB = 1024KiB = 1048576 bytes
- One gibibyte: 1GiB = 1024MiB = 1048576KiB = 1073741824 bytes
The second set of units are multiples of 1000. This aligns much more closely with commonly used units in the SI system, such as metres, litres and grams. A kilogram is 1000 grams; in the same way, a kilobyte is 1000 bytes. This set of units is called decimal units or the SI prefixes.
- One kilobyte: 1kB = 1000 bytes (note the lowercase k)
- One megabyte: 1MB = 1000kB = 1000000 bytes
- One gigabyte: 1GB = 1000MB = 1000000kB = 1000000000 bytes
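The gap between the two sets of units grows with each step up, which is why the distinction matters more for gigabytes than for kilobytes. A minimal sketch of that arithmetic follows; the names IEC_UNITS, SI_UNITS and gap_percent are made up for illustration and do not come from Ubuntu's policy.

```python
# Byte counts for each IEC (binary) and SI (decimal) unit, as defined above.
IEC_UNITS = {"KiB": 1024**1, "MiB": 1024**2, "GiB": 1024**3}
SI_UNITS = {"kB": 1000**1, "MB": 1000**2, "GB": 1000**3}


def gap_percent(iec_bytes: int, si_bytes: int) -> float:
    """Return how much larger the binary unit is than the decimal one, in percent."""
    return (iec_bytes - si_bytes) / si_bytes * 100


# Pair up units of the same rank (KiB with kB, MiB with MB, GiB with GB).
for (iec_name, iec_bytes), (si_name, si_bytes) in zip(IEC_UNITS.items(), SI_UNITS.items()):
    print(
        f"1 {iec_name} = {iec_bytes} bytes, 1 {si_name} = {si_bytes} bytes, "
        f"gap {gap_percent(iec_bytes, si_bytes):.2f}%"
    )
```

The gap is about 2.4% at the kilo level, about 4.9% at the mega level and about 7.4% at the giga level.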
The traditional units:
Traditionally, many applications, operating systems and developers used binary units but gave them SI names. Ubuntu, GNOME and OS X all attempt to follow the published standards explained above. However, Microsoft Windows and many UNIX utilities still use these traditional units, so you need to be aware of them.
- One kilobyte: 1KB = 1024 bytes (note the capital K)
- One megabyte: 1MB = 1024KB = 1048576 bytes
- One gigabyte: 1GB = 1024MB = 1048576KB = 1073741824 bytes
Traditionally, however, speeds are specified in bits per second, with SI prefixes! So 1Mbps is actually 1000000 bits per second, which is 125000 bytes per second, even on Microsoft Windows.
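The speed arithmetic can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the transfer time ignores protocol overhead, which a real link would add.

```python
# SI (decimal) multipliers, which is what network speeds conventionally use.
SI_MULTIPLIERS = {"k": 1000, "M": 1000**2, "G": 1000**3}


def bytes_per_second(value: float, prefix: str) -> float:
    """Convert a speed such as 1 Mbps (value=1, prefix="M") to bytes per second."""
    bits = value * SI_MULTIPLIERS[prefix]
    return bits / 8  # eight bits per byte


def transfer_seconds(size_bytes: int, value: float, prefix: str) -> float:
    """Idealised time to move size_bytes over a link of the given speed."""
    return size_bytes / bytes_per_second(value, prefix)


print(bytes_per_second(1, "M"))            # 1 Mbps in bytes per second
print(transfer_seconds(1048576, 1, "M"))   # seconds to move 1 MiB at 1 Mbps
```

Note that moving 1MiB over a 1Mbps link takes a little over 8 seconds, not exactly 8, because the file size here is binary while the speed is decimal.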
How to avoid ambiguity:
As you can see, these conflicting definitions lead to a lot of confusion. When I say 1MB, do I mean a million bytes, or do I mean 1048576 bytes?
To avoid ambiguity, you should use one of these strategies:
- Exclusively use IEC prefixes. 1MiB is always unambiguous.
- Include a conversion to the number of bytes, e.g. 1MB or 1000000 bytes.
- Use both IEC and SI prefixes, e.g. 1MiB or 1.048MB approx. I prefer this solution, as it makes it clear what you mean, and the reader doesn't have to perform any mental calculations.
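If you are writing an application, the second and third strategies can be combined in one formatting helper. The sketch below is one possible way to do it; describe_size is a hypothetical name, and the two-decimal rounding is an arbitrary choice, not something the units policy prescribes.

```python
IEC_NAMES = ["B", "KiB", "MiB", "GiB", "TiB"]
SI_NAMES = ["B", "kB", "MB", "GB", "TB"]


def _scale(num_bytes: int, base: int, names: list) -> tuple:
    """Divide by base until the value fits the largest unit that keeps it >= 1."""
    value = float(num_bytes)
    for name in names[:-1]:
        if value < base:
            return value, name
        value /= base
    return value, names[-1]


def describe_size(num_bytes: int) -> str:
    """Render a byte count with an IEC figure, an SI figure and the raw bytes."""
    iec_value, iec_name = _scale(num_bytes, 1024, IEC_NAMES)
    si_value, si_name = _scale(num_bytes, 1000, SI_NAMES)
    return f"{iec_value:.2f} {iec_name} or {si_value:.2f} {si_name} ({num_bytes} bytes)"


print(describe_size(1048576))  # 1.00 MiB or 1.05 MB (1048576 bytes)
print(describe_size(5000))     # 4.88 KiB or 5.00 kB (5000 bytes)
```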
Where there is ambiguity, here's a good set of rules of thumb that has served me well:
- If you spot KB (with a capital K), then the traditional units are probably being used.
- If you spot kB (with a lowercase k), then the SI units are probably being used.
- If the number is describing a speed, then decimal units are probably being used.
- If the number is on OS X, on modern Ubuntu or GNOME applications, then decimal units are probably being used.
- If the number is on a hard drive or another piece of computing equipment, then decimal units are probably being used.
- If the number is from a command-line utility on Linux, then traditional binary units are probably being used.
- If the number is from a Microsoft Windows application, then traditional binary units are probably being used.
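The first two rules of thumb can be turned into a small parser. This is a hedged sketch, not an established tool: parse_size and its assume_traditional flag are hypothetical, and for MB, GB and TB the spelling alone cannot tell you which system is meant, so the caller has to supply the context (for example, whether the number came from Windows or from a hard drive label).

```python
import re

IEC = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
# Units whose spelling is the same in the SI and the traditional system.
AMBIGUOUS_RANK = {"MB": 2, "GB": 3, "TB": 4}


def parse_size(text: str, assume_traditional: bool = False) -> int:
    """Guess the number of bytes meant by a string such as "5 kB" or "1MB"."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = float(match.group(1)), match.group(2)

    if unit == "B":
        factor = 1
    elif unit in IEC:      # IEC prefixes are always binary
        factor = IEC[unit]
    elif unit == "kB":     # lowercase k: probably SI
        factor = 1000
    elif unit == "KB":     # capital K: probably traditional
        factor = 1024
    elif unit in AMBIGUOUS_RANK:
        base = 1024 if assume_traditional else 1000
        factor = base ** AMBIGUOUS_RANK[unit]
    else:
        raise ValueError(f"unknown unit: {unit!r}")
    return round(number * factor)


print(parse_size("5 kB"))                          # 5000
print(parse_size("5 KB"))                          # 5120
print(parse_size("1MB"))                           # 1000000
print(parse_size("1MB", assume_traditional=True))  # 1048576
```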
When it comes to Ubuntu applications, have a look at this list specifying which applications use which system.
Intro:
Before reading this answer, make sure you understand what the different units and the different systems are. This is a quick reminder.
The IEC units are 1KiB (1024 bytes), 1MiB, 1GiB and so on. The SI units are 1kB (1000 bytes), 1MB, 1GB and so on. Ubuntu's unit policy mandates that only IEC and SI units be used.
The traditional units are 1KB (1024 bytes), 1MB, 1GB and so on. Ubuntu's unit policy only allows using them for backwards compatibility.
I'm making this answer a community wiki, so please add applications to the list and keep it up-to-date!
Graphical applications:
- GNOME System Monitor follows Ubuntu's unit policy, using IEC units.
- Nautilus follows Ubuntu's unit policy, using SI units. The properties dialog helpfully converts into bytes as well.
- GNOME Disks follows Ubuntu's unit policy, using SI units, helpfully converting into bytes as well.
- Disk Usage Analyser (baobab) follows Ubuntu's unit policy, using SI units.
- GParted follows Ubuntu's unit policy, using IEC units.
- system-config-lvm uses the traditional system. (bug report)
- Firefox uses the traditional system. (bug report)
Command-line applications:
- ls uses the traditional system, but has an --si option.
- du uses the traditional system, but has an --si option.
- df uses the traditional system, but has an --si or -H option.
- fdisk uses the traditional system.
- fparted uses the traditional system.
- lvextend and other commands belonging to the LVM allow you to choose between binary and decimal units, depending on whether you use lower-case or upper-case letters. See the man page for lvs, for example. The man pages aren't as clear as they could be. (bug report)
Best Answer
Short answer is yes, the prefixes change. But it doesn't really make a difference.
Reasoning
There has always been confusion because decimal-style units like KB, MB, GB were used with binary data - KB meant 1024 bytes, not 1000 bytes as might be expected. And of course many people throughout the world use the actual decimal prefixes in their daily lives under the metric system.
Network engineers and long-time computer users of course are trained to understand the difference, but the ongoing confusion meant applications were inconsistent in their usage; one application might use MB to mean 1,000,000 bytes (using the decimal prefix), while another might mean 1,048,576 bytes (using the binary interpretation).
This led to Ubuntu eventually adopting a new units policy.
Impact
The impact is really just a display issue. File sizes and network bandwidth will be displayed using the decimal prefixes, so a 5kB file will actually be 5000 bytes. This is actually in line with what many (most?) people expect.
Memory usage and some low-level utilities will display sizes using the binary prefixes (KiB, MiB, GiB, TiB). This may cause some initial confusion but is actually better than the status quo where we have one prefix meaning two different things.
Since Windows still uses the old, ad-hoc system, a Wine application might display slightly different file sizes for the same file. However, I often see different sizes displayed anyway because of differing rounding methods, so I'm not convinced it's a major issue.
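To get a feel for how large the difference can be, here is a rough sketch of the same file labelled both ways. The two functions are hypothetical stand-ins for what a native file manager following the units policy and a Windows application under Wine might each print; real applications round and format in their own ways.

```python
def decimal_label(num_bytes: int) -> str:
    """Megabytes as the units policy defines them: 1 MB = 1000000 bytes."""
    return f"{num_bytes / 1000**2:.1f} MB"


def traditional_label(num_bytes: int) -> str:
    """Megabytes in the traditional sense: 1 MB = 1048576 bytes."""
    return f"{num_bytes / 1024**2:.1f} MB"


file_size = 5_000_000  # bytes on disk; the file itself does not change
print(decimal_label(file_size))      # 5.0 MB
print(traditional_label(file_size))  # 4.8 MB
```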