Mysql – Upgrade all MySQL columns, tables, and databases from utf8mb3 to utf8mb4

amazon-rdsmysql-5.7mysql-8.0upgradeutf-8

I have thousands of columns across hundreds of tables in about a hundred databases inside a MySQL instance that need to be upgraded from utf8mb3 to utf8mb4. Is there a way to generate ALTER statements for every table and column that needs to change?

I'm upgrading MySQL hosted on Amazon RDS from MySQL 5.7 to 8.0. The Pre-patch compatibility tool tells me:

The following objects use the utf8mb3 character set. It is recommended to convert them to use utf8mb4 instead, for improved Unicode support.

More Information:
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html

It then lists about 6,000 databases and columns that need to be updated:

mydb - schema's default character set: utf8
mydb.mytable.mycolumn - column's default character set: utf8

I'd like to be able to generate all the ALTER statements I'll need to run similar to how MySQL command querying all MyISAM database helped my convert all my tables from MyISAM to Innodb.

Best Answer

Here is a query what will generate all the necessary update statements.

  • It temporarily disables foreign key checks so that the update statements succeed.
  • It updates the default character set for each database that needs it.
  • It updates the default character set of each table:
    • If it doesn't have the right default character set.
    • If it contains text columns that are not the right character set.
    • The same ALTER TABLE ... CONVERT statement both updates the table default and all the columns within the table.

SELECT
    /* Disable foreign key checks temporily to be able to make these changes */
    'SET FOREIGN_KEY_CHECKS = 0;' AS alter_statement 
UNION SELECT 
    /* Alter the default character set of each database */
    CONCAT('ALTER DATABASE `', SCHEMA_NAME,'` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;') AS alter_statement 
FROM 
    information_schema.SCHEMATA 
WHERE 
    DEFAULT_CHARACTER_SET_NAME!='utf8mb4' AND 
    SCHEMA_NAME NOT IN('mysql','information_schema','performance_schema','sys')
UNION SELECT
    /* Alter the default character set of each table .
      This also converts all text columns in the table,
      So there is no need to have a statement to alter each
      column individually */
    DISTINCT CONCAT('ALTER TABLE `', TABLE_SCHEMA,'`.`',TABLE_NAME, '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;') AS alter_statement 
FROM
(
    SELECT
        /* Find all tables with a text column that isn't utf8mb4 */
        TABLE_SCHEMA, TABLE_NAME
    FROM
        information_schema.COLUMNS 
    WHERE
        TABLE_SCHEMA NOT IN('mysql','information_schema','performance_schema','sys') AND 
        CHARACTER_SET_NAME IS NOT NULL AND 
        CHARACTER_SET_NAME!='utf8mb4'
    UNION SELECT
        /* Also find all tables that don't have the correct default character set */
        TABLE_SCHEMA, TABLE_NAME
    FROM 
        information_schema.TABLES AS T
    JOIN 
        information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C ON C.collation_name = T.table_collation
    WHERE 
        CHARACTER_SET_NAME!='utf8mb4' AND 
        TABLE_SCHEMA NOT IN('mysql','information_schema','performance_schema','sys')
) AS TABLE_UPDATES
UNION SELECT
    /* Re-enable forign key checks */
    'SET FOREIGN_KEY_CHECKS = 1;' AS alter_statement 
;

When run, it should generate output like:

SET FOREIGN_KEY_CHECKS = 0;
ALTER DATABASE `mydb` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `mydb`.`mytable` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
SET FOREIGN_KEY_CHECKS = 1;

Sources: