When installing Linux, what factors go into choosing the locale for the server?

character-encoding locale

When I am installing Linux (for the GB locale) I am presented with the option of choosing en_GB, en_GB.UTF-8 and en_GB.ISO-8859-15.

What factors go into making the choice? As far as I know, the British alphabet doesn't need UTF-8; or if it does, I haven't experienced or recognized the problems that causes on a server.

Is there some way to tell which may be more appropriate for one's case? I know that database installations like Postgres, MySQL and SQLite seem to prefer the UTF-8 locale.

Best Answer

The difference between these options is which character encoding is used for text. If you choose en_GB, the system will use the ISO-8859-1 character set. ISO-8859-15 is roughly equivalent to ISO-8859-1, but eight code points have changed meaning; for example, the currency symbol ¤ has been replaced by the Euro sign €. These encodings use 8 bits per character, and are thus limited to fixed sets of 256 different characters (even fewer in practice).
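To make the difference concrete, here is a small Python sketch (not part of the original answer) that decodes the same byte under both 8-bit encodings and lists the positions where they disagree. It relies only on Python's built-in `iso8859-1` and `iso8859-15` codecs; the variable names are just for illustration.

```python
# The same byte means different characters in the two 8-bit encodings.
raw = bytes([0xA4])
as_latin1 = raw.decode("iso8859-1")    # generic currency sign
as_latin9 = raw.decode("iso8859-15")   # Euro sign

# Collect every byte value whose meaning differs between the two encodings.
differing = [
    b for b in range(256)
    if bytes([b]).decode("iso8859-1") != bytes([b]).decode("iso8859-15")
]

print(repr(as_latin1), repr(as_latin9))
print([hex(b) for b in differing])
```

A file containing byte 0xA4 therefore displays differently depending on which of the two locales reads it, which is the practical risk of mixing them.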

UTF-8 is a Unicode encoding. Unicode is the all-encompassing character representation scheme, defining code points for more than 128000 characters and emojis. Unicode definitely also supports the British alphabet.
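As a rough illustration of how UTF-8 differs from the fixed 8-bit encodings, the following Python sketch shows that UTF-8 spends a variable number of bytes per character and can represent characters that ISO-8859-15 cannot. The sample characters and the helper `fits_latin9` are chosen for illustration only.

```python
# Sample characters: ASCII letter, pound sign, Euro sign, an emoji.
samples = ["A", "\u00a3", "\u20ac", "\U0001F600"]

# UTF-8 uses between one and four bytes per character.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

def fits_latin9(ch):
    """Return True if the character exists in ISO-8859-15 (hypothetical helper)."""
    try:
        ch.encode("iso8859-15")
        return True
    except UnicodeEncodeError:
        return False

latin9_support = {ch: fits_latin9(ch) for ch in samples}

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_lengths[ch]} UTF-8 byte(s), in ISO-8859-15: {latin9_support[ch]}")
```

The pound and Euro signs fit in either encoding, so plain British English text works everywhere; the difference only shows up once text from outside the 8-bit repertoire arrives.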

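My recommendation is to use UTF-8, because it can represent every character of the other character sets (Unicode is a superset of them) and it is widely used on Linux today. Note that only the ASCII range is byte-for-byte identical; characters such as £ are stored as different bytes in UTF-8 than in ISO-8859-1.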
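The "superset" claim can be checked with a short Python sketch: every byte of the two 8-bit encodings should survive a round trip through UTF-8, while only ASCII keeps the same byte values. The function name `round_trips_through_utf8` is made up for this example.

```python
def round_trips_through_utf8(codec):
    """Check that every byte of an 8-bit codec survives conversion to UTF-8 and back."""
    for b in range(256):
        original = bytes([b])
        text = original.decode(codec)
        via_utf8 = text.encode("utf-8").decode("utf-8")
        if via_utf8.encode(codec) != original:
            return False
    return True

latin1_ok = round_trips_through_utf8("iso8859-1")
latin9_ok = round_trips_through_utf8("iso8859-15")

# ASCII characters are encoded as the same single byte in UTF-8.
ascii_bytes_identical = all(chr(b).encode("utf-8") == bytes([b]) for b in range(128))

# Outside ASCII the byte sequences differ, e.g. for the pound sign.
pound_bytes_differ = "\u00a3".encode("utf-8") != "\u00a3".encode("iso8859-1")

print(latin1_ok, latin9_ok, ascii_bytes_identical, pound_bytes_differ)
```

So no characters are lost by moving to UTF-8, but existing files written under an 8-bit locale need converting rather than simply being reinterpreted.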