MySQL and MariaDB: COLLATE on table vs COLLATE on all columns

collationmariadbMySQLperformance

Will the following table behave or perform any differently

CREATE TABLE t1 (
  foo char(60) COLLATE utf8_bin,
  bar char(64) COLLATE utf8_bin,
  baz char(80) COLLATE utf8_bin
) DEFAULT CHARSET=utf8;

than this table

CREATE TABLE t2 (
  foo char(60),
  bar char(64),
  baz char(80)
) DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

I ask this because when I run SHOW CREATE TABLE mysql.db it shows up with the marker on every column (as in t1), and on the table (as in t2). I don't ever add DEFAULT COLLATE when I create tables. I'm wondering why the system tables have this option set.

Best Answer

  1. In their current form: the behavior and performance of t1 and t2 should be identical,

    and:

  2. IF a string column is added that does not specify either CHARSET or COLLATE clauses: the behavior and performance will be different. This is due to:

    1. new columns in t2 will be CHARSET=utf8 and COLLATE=utf8_bin (because they were specified as defaults for the table), but

    2. new columns in t1 will be CHARSET=utf8 (because that was specified as default for the table), but will also be COLLATE=utf8_general_ci (because this should be the default for the utf8 character set)

Now, I created the two tables as you have them in the question, and that resulted in the following:

SHOW CREATE TABLE t1;

CREATE TABLE `t1` (
  `foo` char(60) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `bar` char(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `baz` char(80) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8

and:

SHOW CREATE TABLE t2;

CREATE TABLE `t2` (
  `foo` char(60) COLLATE utf8_bin DEFAULT NULL,
  `bar` char(64) COLLATE utf8_bin DEFAULT NULL,
  `baz` char(80) COLLATE utf8_bin DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin

I'm not sure why CHARACTER SET became explicitly stated in the stored version of the CREATE TABLE t1 statement, but I think I know why COLLATE did for CREATE TABLE t2: once you submit the statement, that's what it gets translated into as it's being processed. Why just the COLLATE statement and not also CHARACTER SET? That is most likely due to character sets working with 1 or more collations, so specifying the character set does not imply the collation. On the other hand, a collation works with only 1 character set, so specifying the collation does imply the character set. Yes, character sets have default collations, but defaults can change over time or between contexts. Yes, there are defaults explicitly declared for the table, but what if a) the table defaults aren't respected, and/or b) whatever is reading the DDL wants to know everything about a column as it parses the column, as opposed to needing to read the full table DDL first and potentially go back to apply defaults. Adding the COLLATE clauses makes the operation very clear and unambiguous.

Also, the documentation for CREATE TABLE Statement Retention states:

The original CREATE TABLE statement, including all specifications and table options are stored by MySQL when the table is created. The information is retained so that if you change storage engines, collations or other settings using an ALTER TABLE statement, the original table options specified are retained. This enables you to change between InnoDB and MyISAM table types even though the row formats supported by the two engines are different.

I tested this on db<>fiddle against MariaDB 10.2 and 10.3. and MySQL 5.6, 5.7, and 8.0, and the behavior was the same across all of them.