Mysql – How do MariaDB and MySQL differ in character set handling

character-setmariadbMySQLutf-8

In MySQL/MariaDB, I set the connection character set to UTF-8:

SET NAMES 'utf8mb4';
SET CHARACTER SET 'utf8mb4';

There are some tables with a latin1_swedish_ci collation and latin1 encoded data in them. SELECTing from them gives me correct UTF-8 data.

When I once incorporated SET NAMES, SET CHARACTER SET in my program startup code, I assumed that I could give UTF-8 encoded data in INSERT strings to the MySQL server and it would convert it according to the table/column collation. That assumption came from me testing locally with MariaDB, where it worked. Then on the production site with MySQL: The UTF-8 data would not be converted to latin1 but saved as UTF-8 in Latin1 column. To fix that, I always encode values explicitly to latin1 when INSERTing into latin1 table.

Now, I revisited some code that did not SET NAMES, SET CHARACTER SET. I changed it to SET it. Now the MariaDB/MySQL difference strikes again. What is the difference between MariaDB and MySQL character set handling?

Best Answer

MySQL and MariaDB handle character sets the same. However... Different versions of MySQL have different defaults. Ditto for MariaDB.

SET NAMES declares what encoding is in the client. This is independent of the CHARACTER SET on a column or table. (There are, of course, characters that are not representable in latin1, so there can be conversion problems.) As you say, the conversion occurs on INSERT or SELECT.

To further pursue this, please tell us what client you are using, what connection parameters were used, any SETs performed, and what happened.

For diagnosing what is stored, use SELECT col, HEX(col).... More details.