Ubuntu – Unable to connect to a wifi network with an SSID containing diacritics

Tags: android, broadcom, character encoding, Ubuntu, wifi

My current WiFi network (and I'm writing this with the help of Windows -.- ) has a diacritic in its SSID: "ö".

If changing the SSID is not an option, how can I connect to such a network?

Connecting fails on Ubuntu 12.04, Android 2.3.6, and Android 4.0, so I guess it's a general Linux problem. The network adapter is a "Broadcom 4313".

The problem seems to be that the network is hidden. Both Android and Ubuntu fail to interpret the manually entered SSID string correctly. However, both systems can see the network, if it's not hidden. Ubuntu displays the SSID correctly; Android fails at the diacritic "ö" and also drops the following two characters, so instead of "[some characters]örc[some other characters]" it shows "[some characters] [some other characters]".

So it's the combination of hidden SSID and special-character-SSID that causes the problem.

Best Answer

I'm guessing this has to do with encoding. According to this answer, an SSID may (now) have an explicit UTF-8 or UNSPECIFIED encoding, but the "SSIDEncoding" field is part of a newer standard. Presumably then on networks with equipment older than this, it is effectively "unspecified".

I would like to think that anything which sets an SSID from text typed by a human would do it in ASCII or UTF-8. However, the standard specifying the SSIDEncoding field appears to be dated 2012, so before that, any encoding at all could be used (and any encoding at all can still be used, as UNSPECIFIED). So there could be some software somewhere that sets SSIDs in something else, for example by using ASCII but falling back on UTF-16 when a string contains non-ASCII characters. Java and, I believe, Windows both use UTF-16 internally.

The router almost certainly regards the SSID as a sequence of bytes and does not care about any potential encoding at all, so it will accept whatever bytes it was passed when the SSID was set. To determine which encoding was actually used, you'd have to look at the byte sequence as broadcast.
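To make the "sequence of bytes" point concrete, here is a minimal Python sketch of what the same visible name looks like under a few candidate encodings. The SSID "Björn" is a made-up stand-in (the real name is not given in the question), and the list of candidate encodings is only a guess at what configuration software might plausibly use.

```python
# Hypothetical SSID for illustration only; "\u00f6" is the character "ö".
SSID_TEXT = "Bj\u00f6rn"

# Assumed candidates; the software that set the real SSID may have used something else.
CANDIDATE_ENCODINGS = ("utf-8", "latin-1", "utf-16-le")


def ssid_bytes_by_encoding(text, encodings=CANDIDATE_ENCODINGS):
    """Return {encoding: hex dump} of the bytes each encoding produces for text."""
    return {enc: text.encode(enc).hex() for enc in encodings}


if __name__ == "__main__":
    for enc, hex_dump in ssid_bytes_by_encoding(SSID_TEXT).items():
        print(f"{enc:10} {hex_dump}")
```

Comparing these hex dumps against the bytes captured from the actual beacon frames would show which encoding, if any of these, was used when the SSID was set.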

"both systems can see the network, if it's not hidden."

It's possible to recognize a UTF-16 string for what it is, so when the network is not hidden, that may happen, and the SSID is then translated into the local encoding for display. But when you enter the SSID manually, the system can't know which encoding to use when it searches for the network; the search will only work if the encoding matches the one used by the software that set the SSID in the first place. The new SSIDEncoding field does potentially resolve this, but A) it also allows for the UNSPECIFIED loophole, and B) older equipment won't care. Since Linux and Android generally use UTF-8, if the SSID is actually a UTF-16 string, it may look the same on the screen but not match when it is entered manually and searched for.
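The mismatch described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the SSID "Björn" is hypothetical, the stored bytes are assumed to be UTF-16 (little-endian) as in the guess above, and `encodings_that_match` is a helper invented for this example, not part of any Wi-Fi tool.

```python
def encodings_that_match(entered_text, stored_bytes,
                         candidates=("utf-8", "latin-1", "utf-16-le")):
    """Return the candidate encodings under which entered_text reproduces stored_bytes."""
    matching = []
    for enc in candidates:
        try:
            encoded = entered_text.encode(enc)
        except UnicodeEncodeError:
            continue  # this encoding cannot represent the text at all
        if encoded == stored_bytes:
            matching.append(enc)
    return matching


# Assumption: the software that configured the router stored the name as UTF-16LE.
stored = "Bj\u00f6rn".encode("utf-16-le")

# A client that encodes manual input as UTF-8 produces different bytes ...
utf8_client_matches = "Bj\u00f6rn".encode("utf-8") == stored

# ... even though the text is identical once the stored bytes are decoded for display.
looks_the_same = stored.decode("utf-16-le") == "Bj\u00f6rn"

if __name__ == "__main__":
    print(utf8_client_matches, looks_the_same)
    print(encodings_that_match("Bj\u00f6rn", stored))
```

The name displays identically on screen, yet only the encoding originally used yields the byte sequence that the access point will recognize.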
