MySQL 5.1 to 5.5: all UTF-8 characters lost!

MySQLunicode

I decided to upgrade my Debian installation, and after some days using MySQL 5.5, I found that mysql prints garbage to the terminal intead of UTF-8 characters.

I had a look at the SQL file that was produced by mysqldump, and noticed it already contained the garbage. Today I dropped the database and got a copy of the old /var/lib/mysql/database_name/* which was working well and, to my surprise, I still get the same results.

The file produced by mysqldump includes the following lines:

/*!40101 SET NAMES utf8 */;
/*!40101 SET character_set_client = utf8 */;
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

/var/lib/mysql/database_name/db.opt from the old and new databases are the same:

default-character-set=latin1
default-collation=latin1_swedish_ci

What can I do now? I hope I won't lose the entire database!

UPDATE

show session variables like 'character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

SELECT charset(name), hex(name), name FROM table WHERE id='123';
+---------------+----------------------------------------------+----------------------------+
| charset(name) | hex(name)                                    | name                       |
+---------------+----------------------------------------------+----------------------------+
| latin1        | 457361C3BA204A6F73C3A92064652053616E74616E61 | Esaú José de Santana     |
+---------------+----------------------------------------------+----------------------------+

The output of name in the last command should be Esaú José de Santana

Best Answer

The MySQL output and the terminal work correctly; the problem is that the text in the database is labelled as latin1, but actually encoded as utf8.

This can be fixed by

  1. dumping the database into a Latin1 file, then importing that file as UTF-8: Fixing a MySQL Character Encoding Mismatch; or
  2. in MySQL, converting through binary to ignore the wrong encoding:

    UPDATE table SET column=CONVERT(CONVERT(column USING binary) USING utf8) WHERE id=123;
    

    Fixing column encoding mess in MySQL.

Related Question