Mysql – Why is there a MySQL character encoding issue after database transfer

MySQL

I understand that there are probably hundreds of questions here addressing this issue but I haven't really found anything that worked yet.

The Scenario

I built a site on our staging site using WordPress. WordPress now uses utf8mb4 in the newer installations. Knowing this I checked the MySQL version on the live site to ensure compatibility and noticed it was using MySQL version 5.1.x.

I upgraded mysql on the live site following this tutorial with zero problems: https://kb.plesk.com/en/125589

At this point I do an export of the staging site database, replace all the staging site urls to the new live site urls and then import. Site runs fine until I see some diamond characters with question marks in them. I believe there is some character encoding issue at this point.

Just in case it wasn't a bad export I imported it locally on a MySQL version 5.5.3 on OSX. The import works fine. No character encoding issue.

I could be wrong but it might be treating the uft8mb4 encoding as utf8 on the live site.

Question(s)

If the export is working on my localhost with a similar MySQL version what could be causing the database on the live site to be spitting out missing characters?

Additional Info

Live Site MySQL version and status (encoding issues with import)

mysql  Ver 14.14 Distrib 5.5.52, for Linux (x86_64) using readline 5.1

Connection id:      20682
Current database:   
Current user:       admin@localhost
SSL:            Not in use
Current pager:      stdout
Using outfile:      ''
Using delimiter:    ;
Server version:     5.5.52-cll-lve MySQL Community Server (GPL) by Atomicorp
Protocol version:   10
Connection:     Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:        /var/lib/mysql/mysql.sock
Uptime:         16 hours 11 min 18 sec

Local mysql version and status (imports fine)

mysql  Ver 14.14 Distrib 5.5.33, for osx10.6 (i386) using  EditLine wrapper

Connection id:      4402
Current database:   
Current user:       root@localhost
SSL:            Not in use
Current pager:      stdout
Using outfile:      ''
Using delimiter:    ;
Server version:     5.5.33 Source distribution
Protocol version:   10
Connection:     Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:        /Applications/MAMP/tmp/mysql/mysql.sock
Uptime:         14 days 8 hours 3 min 6 sec

Staging mysql version and status (Export source; site works fine)

VERSION is 5.5.5 (I couldn't get the status info of this one)

SQL dump file (a few tables only with some settings)

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

DROP TABLE IF EXISTS `wp_blc_filters`;

CREATE TABLE `wp_blc_instances` (
  `instance_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `link_id` int(10) unsigned NOT NULL,
  `container_id` int(10) unsigned NOT NULL,
  `container_type` varchar(40) NOT NULL DEFAULT 'post',
  `link_text` varchar(250) NOT NULL DEFAULT '',
  `parser_type` varchar(40) NOT NULL DEFAULT 'link',
  `container_field` varchar(250) NOT NULL DEFAULT '',
  `link_context` varchar(250) NOT NULL DEFAULT '',
  `raw_url` text NOT NULL,
  PRIMARY KEY (`instance_id`),
  KEY `link_id` (`link_id`),
  KEY `source_id` (`container_type`,`container_id`),
  KEY `parser_type` (`parser_type`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `wp_postmeta`;

CREATE TABLE `wp_postmeta` (
  `meta_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `post_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `meta_key` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `meta_value` longtext COLLATE utf8mb4_unicode_ci,
  PRIMARY KEY (`meta_id`),
  KEY `post_id` (`post_id`),
  KEY `meta_key` (`meta_key`(191))
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Output of SHOW FULL COLUMNS FROM wp_posts;

ID                      bigint(20)      NULL    
post_author             bigint(20)      NULL    
post_date               datetime        NULL    
post_date_gmt           datetime        NULL    
post_content            longtext        utf8mb4_unicode_ci  
post_title              mediumtext      utf8mb4_unicode_ci  
post_excerpt            mediumtext      utf8mb4_unicode_ci  
post_status             varchar(20)     utf8mb4_unicode_ci  
comment_status          varchar(20)     utf8mb4_unicode_ci  
ping_status             varchar(20)     utf8mb4_unicode_ci  
post_password           varchar(20)     utf8mb4_unicode_ci  
post_name               varchar(200)    utf8mb4_unicode_ci  
to_ping                 mediumtext      utf8mb4_unicode_ci  
pinged                  mediumtext      utf8mb4_unicode_ci  
post_modified           datetime        NULL
post_modified_gmt       datetime        NULL
post_content_filtered   longtext        utf8mb4_unicode_ci 
post_parent             bigint(20)      NULL               
guid                    varchar(255)    utf8mb4_unicode_ci 
menu_order              int(11)         NULL               
post_type               varchar(20)     utf8mb4_unicode_ci 
post_mime_type          varchar(100)    utf8mb4_unicode_ci 
comment_count           bigint(20)      NULL               

I found some info that was different between the mysql servers when I ran the query SHOW VARIABLES LIKE 'char%'. Here are the results below if this helps distinguish the problem.

Local mysql

character_set_client        utf8
character_set_connection    utf8
character_set_database      utf8
character_set_filesystem    binary
character_set_results       utf8
character_set_server        latin1
character_set_system        utf8

Staging mysql

character_set_client        utf8
character_set_connection    utf8
character_set_database      latin1
character_set_filesystem    binary
character_set_results       utf8
character_set_server        latin1
character_set_system        utf8

Live mysql (remember this is the one having issues)

character_set_client        utf8mb4
character_set_connection    utf8mb4
character_set_database      utf8mb4
character_set_filesystem    binary
character_set_results       utf8mb4
character_set_server        latin1
character_set_system        utf8

This is the diff of the SHOW VARIABLES command in mysql for my live and local environments

Local Environment

-| basedir                                           | /Applications/MAMP/Library                                                                                             |
-| character_sets_dir                                | /Applications/MAMP/Library/share/charsets/                                                                             |
-| datadir                                           | /Applications/MAMP/db/mysql/                                                                                           |
-| general_log_file                                  | /Applications/MAMP/db/mysql/Joes-Mac.log                                                                               |
-| have_openssl                                      | NO                                                                                                                     |
-| have_ssl                                          | NO                                                                                                                     |
-| have_symlink                                      | YES                                                                                                                    |
-| hostname                                          | Joes-Mac.local                                                                                                         |
-| innodb_additional_mem_pool_size                   | 8388608                                                                                                                |
-| innodb_buffer_pool_size                           | 134217728                                                                                                              |
-| innodb_log_buffer_size                            | 8388608                                                                                                                |
-| innodb_thread_concurrency                         | 0                                                                                                                      |
-| innodb_use_native_aio                             | OFF                                                                                                                    |
-| innodb_version                                    | 5.5.33                                                                                                                 |
-| lc_messages_dir                                   | /Applications/MAMP/Library/share/                                                                                      |
-| local_infile                                      | ON                                                                                                                     |
-| log_error                                         | /Applications/MAMP/logs/mysql_error_log.err                                                                            |
-| lower_case_file_system                            | ON                                                                                                                     |
-| lower_case_table_names                            | 2                                                                                                                      |
-| open_files_limit                                  | 2560                                                                                                                   |
-| pid_file                                          | /Applications/MAMP/tmp/mysql/mysql.pid                                                                                 |
-| plugin_dir                                        | /Applications/MAMP/Library/lib/plugin/                                                                                 |
-| port                                              | 8889                                                                                                                   |
-| pseudo_thread_id                                  | 4465                                                                                                                   |
-| report_port                                       | 8889                                                                                                                   |
-| slave_load_tmpdir                                 | /var/folders/bj/t8j_n5qx3s1cgsq9c2_7nw4w0000gq/T/                                                                      |
-| slow_query_log_file                               | /Applications/MAMP/db/mysql/Joes-Mac-slow.log                                                                          |
-| socket                                            | /Applications/MAMP/tmp/mysql/mysql.sock                                                                                |
-| system_time_zone                                  | PDT                                                                                                                    |
-| timestamp                                         | 1477524853                                                                                                             |
-| tmpdir                                            | /var/folders/bj/t8j_n5qx3s1cgsq9c2_7nw4w0000gq/T/                                                                      |
-| version                                           | 5.5.33                                                                                                                 |
-| version_comment                                   | Source distribution                                                                                                    |
-| version_compile_machine                           | i386                                                                                                                   |
-| version_compile_os                                | osx10.6                                                                                                                |


Live Environment

+| basedir                                           | /usr                                                                                                                   |
+| character_sets_dir                                | /usr/share/mysql/charsets/                                                                                             |
+| datadir                                           | /var/lib/mysql/                                                                                                        |
+| general_log_file                                  | /var/lib/mysql/vps-1003610-982.log                                                                                     |
+| have_openssl                                      | DISABLED                                                                                                               |
+| have_ssl                                          | DISABLED                                                                                                               |
+| have_symlink                                      | DISABLED                                                                                                               |
+| hostname                                          | vps-1003610-982.clients.robsoninc.com                                                                                  |
+| innodb_additional_mem_pool_size                   | 524288                                                                                                                 |
+| innodb_buffer_pool_size                           | 5242880                                                                                                                |
+| innodb_log_buffer_size                            | 512000                                                                                                                 |
+| innodb_thread_concurrency                         | 2                                                                                                                      |
+| innodb_use_native_aio                             | ON                                                                                                                     |
+| innodb_version                                    | 5.5.52                                                                                                                 |
+| lc_messages_dir                                   | /usr/share/mysql/                                                                                                      |
+| local_infile                                      | OFF                                                                                                                    |
+| log_error                                         | /var/log/mysqld.log                                                                                                    |
+| lower_case_file_system                            | OFF                                                                                                                    |
+| lower_case_table_names                            | 0                                                                                                                      |
+| open_files_limit                                  | 1024                                                                                                                   |
+| pid_file                                          | /var/run/mysqld/mysqld.pid                                                                                             |
+| plugin_dir                                        | /usr/lib64/mysql/plugin/                                                                                               |
+| port                                              | 3306                                                                                                                   |
+| pseudo_thread_id                                  | 29088                                                                                                                  |
+| report_port                                       | 3306                                                                                                                   |
+| slave_load_tmpdir                                 | /tmp                                                                                                                   |
+| slow_query_log_file                               | /var/lib/mysql/vps-1003610-982-slow.log                                                                                |
+| socket                                            | /var/lib/mysql/mysql.sock                                                                                              |
+| system_time_zone                                  | MSD                                                                                                                    |
+| thread_statistics                                 | OFF                                                                                                                    |
+| timestamp                                         | 1477524780                                                                                                             |
+| tmpdir                                            | /tmp                                                                                                                   |
+| userstat                                          | OFF                                                                                                                    |
+| version                                           | 5.5.52-cll-lve                                                                                                         |
+| version_comment                                   | MySQL Community Server (GPL) by Atomicorp                                                                              |
+| version_compile_machine                           | x86_64                                                                                                                 |
+| version_compile_os                                | Linux                                                                                                                  |

Question 2

Does the database character settings affect the output (by output I mean what I see in the browser) at all?

Update

Posted wrong Staging site database status output. I forgot the site was on a different staging site. Will get that info soon.

Update 2

I checked the php info to see which MySQL it was using and found these two lines differ than my other environments:

Client API version  5.1.59

Client API library version  5.1.59

Does that mean its using MySQL version 5.1.59 when used with PHP? If so how would I go about updating that part after I initially made the upgrade from 5.1 to 5.5 in CentOS?

Update 3

On the live site I have ran the following query:

SELECT post_title FROM wp_posts WHERE LENGTH(post_content) > CHAR_LENGTH(post_content);

This gave me all the results where the encoding is different (or something like that). I received a bunch of rows in the query and one of them was the content that I was having an issue with. My assumption is that utf8mb4 is being imported into utf8 but the tables and columns are all utf8mb4 encoding and collation. So I don't know where the character sets transform between this transaction. I have tested the strings in PHP to test the encoding and it says its UTF-8 which means it could be utf8 or utf8mb4 in mysql.

It could also be worth mentioning that I have been using Sequel Pro to import and export the data back and forth. So a possibility that Sequel Pro could be interfering with the importing/exporting of the sql data

Update 4

Tried this command:

ALTER DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Tested it out and no change. Then I tried to import the sql file again and still, no change

Best Answer

Question mark in a black diamond occurs in two cases. See this for details.

The black diamonds probably occurred by not setting the connection to be utf8/utf8mb4 when reading the data.

To dig further, we need to see some specifics, preferably dumped in hex.

If the data was correctly stored into CHARACTER SET utf8 in the old system, then the dump would have data that could correctly be loaded into columns defined either CHARACTER SET utf8 or CHARACTER SET utf8mb4. That is there is full compatibility in that direction. The utf8mb4 of 5.5.3+ is a superset of utf8 (before or after 5.5.3).

The dump was using mysqldump? Which version? The 5.1 version? Or a post-5.5.3 version? (I don't think it matters, but please check anyway.)

And what arguments were on mysqldump? This could be important, since, if it is converting to latin1, that could explain the mess.

Are you sure about the dump file? I see CHARSET=utf8mb4.

Q2: Does the database character settings affect the output at all? -- It is complicated. And what do you mean by "output"? The black diamonds are all the way out at the browser. We need to check what is in the table (see "Test the Data" in the link).