Mysql – German umlaute represented by questionmarks when latin1 is used

character-setmariadbMySQL

I'm almost certain this has come up before but I couldn't find anything regarding it.

My problem is that I've a database and a database server (and every table and column inside) set to latin1/latin1_general_ci.

Curiously when I connect with a client (for example HeidiSQL) client specific variables regarding character sets are set to utf8mb4, even more curious: äüöß are shown correctly.

When I execute SET NAMES latin1; and check again, the variables are set to latin1, as they should be, but the Umlaute are now displayed as question marks. This is also true if they where inserted AFTER SET NAMES latin1; was executed.

Another database server is completely set to utf8, except the database my program works with, which is again set to latin1. If I execute SET NAMES latin1; there it works as expected.

Could someone explain/tell me what is going on here? What am I doing wrong?

The variables I'm talking about are:

  • character_set_client [Global/Default: latin1]
  • character_set_connection [Global/Default: latin1]
  • character_set_database [Global/Default: latin1]
  • character_set_results [Global/Default: latin1]
  • character_set_server [Global/Default: latin1]
  • collation_connection [Global/Default: latin1_general_ci]

After some toying around it seems like character_set_results is the problematic one here. Setting it to latin1, while leaving the other ones at utf8mb4, results in question marks being displayed. Still no idea why this happens.

edit

Result of SHOW CREATE TABLE adressen:

CREATE TABLE `adressen` (
  `ID` int(9) unsigned NOT NULL AUTO_INCREMENT,
  `Strasse` longtext,
  `Strasse2` longtext,
  `Hausnummer` longtext,
  `Postleitzahl` longtext,
  `Ort` longtext NOT NULL,
  `Postfach` longtext,
  `IDLand` int(9) unsigned NOT NULL,
  `Laengengrad` float NOT NULL,
  `Breitengrad` float NOT NULL,
  `Adresszusatz` longtext,
  PRIMARY KEY (`ID`),
  KEY `FK_Adressen_Land` (`IDLand`),
  CONSTRAINT `FK_Adressen_Land` FOREIGN KEY (`IDLand`) REFERENCES `laender` (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=137205 DEFAULT CHARSET=latin1

SELECT STRASSE, HEX(STRASSE) FROM adressen WHERE STRASSE LIKE 'ä%':
Ä[…];C4[…]

According to Wikipedia's article about ISO8859 that's latin1 alright.

edit2

After some further reading I found that both, HeidiSQL and .NET handle text in utf8 / unicode [at least by default]. The solution to this question is thus to convert everything to utf8mb4_unicode_ci as per this answer.

Best Answer

See "Question Marks" in here . In particular,

  • The bytes to be stored are not encoded as utf8/utf8mb4. Fix this.
  • The column in the database is CHARACTER SET utf8 (or utf8mb4). Fix this.
  • Also, check that the connection during reading is UTF-8.

Since you seem to be aiming for latin1, not utf8, see "Test the data" in that link. äüöß will be hex E4 FC F6 DF in latin1, or C3A4 C3BC C3B6 C39F in utf8/utf8mb4.

SET NAMES establishes the encoding in the client, not the server. The table's encoding is based on SHOW CREATE TABLE; let's see that.

OR...

It is OK if you want latin1 used in the columns/tables, but your client wants utf8. Then the connection setting (to say utf8 / UTF-8) or the use of SET NAMES utf8 (after connecting) will alert MySQL to convert as data is transferred between the client and the server and database.

(With latin1, you are limited to Western European characters.)