Mysql – When are we supposed to set default-character-set for the client

backupMySQLmysqldump

I'm trying to understand when I'm supposed to set/force default-character-set for the client.

The documentation states:

You can force client programs to use specific character set as
follows:

[client] default-character-set=charset_name

This is normally unnecessary. However, when character_set_system
differs from character_set_server or character_set_client, and you
input characters manually (as database object identifiers, column
values, or both), these may be displayed incorrectly in output from
the client or the output itself may be formatted incorrectly. In such
cases, starting the mysql client with
–default-character-set=system_character_set—that is, setting the client character set to match the system character set—should fix the
problem.

Let's assume charater_set_system and character_set_server are different.

My first question:

I suppose that when I'm importing an export files generated by mysqldump the variable default-character-set has no effect as set names is always issued at the very beginning of the export file.
Am I right or there could be corner cases?

The second question:

What about manually running sql scripts? The documentations states
that some characters may be displayed incorrectly. How can I be sure that the data is imported correctly (a part from running an application test i.e. by trial and error)?

Best Answer

You have the option of setting character sets/collations on the fly:

While trial-and-error may be necessary, don't go willy-nilly on these variables in /etc/my.cnf. You are better off setting them dynamically during any trial-and-error testing.

To make sure of any corner cases, look at the initial variables of any mysqldump and see if character sets or collations are set in the beginning and reset at the bottom.

In fact, here is a sample for a mysqldump:

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

See the variables

  • @OLD_CHARACTER_SET_CLIENT
  • @OLD_CHARACTER_SET_RESULTS
  • @OLD_COLLATION_CONNECTION
  • SET NAMES is hardwired to utf8

You could perhaps set these options

  • when calling the mysqldump
  • place these options in /etc/my.cnf under [mysqldump] group section
  • edit these values for existing mysqldumps using perl, awk, etc.