MySQL – Compare Values of Two Large Tables Efficiently

execution-planMySQLperformancequery-performance

On MySQL 5.7 I want to get the records from table 1 that are not present in table 2.

I'm using this simple LEFT JOIN query that is very slow (more than 2 minutes), I don't know why.

SELECT * FROM t1 LEFT JOIN t2 ON t1.name = t2.name WHERE t2.name is null

When I do an explain of the query I get this result

id   select_type     table     partitions    type     possible_keys  key          key_len      ref     rows     filtered     Extra
1    SIMPLE          t1        NULL          index    NULL           NAME_UNIQUE  49           NULL    66387    100.00       Using index
1    SIMPLE          t2        NULL          index    NULL           NAEM_UNIQUE  37           NULL    2275410  10.00        Using where; Using index; Using join buffer (Block Nested Loop)     

I have a unique index on both columns "name".
Here are the tables definitions:

CREATE TABLE `t1` (
`ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`NAME` char(12) CHARACTER SET utf8 DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `NAME_UNIQUE` (`NAME`)
) ENGINE=InnoDB AUTO_INCREMENT=50053 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;


CREATE TABLE `t2` (
`ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`NAME` char(12) DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `NAME_UNIQUE` (`NAME`)
) ENGINE=InnoDB AUTO_INCREMENT=2049985 DEFAULT CHARSET=utf8mb4;

I think it's something related to the use of "Using join buffer (Block nested loop)" but I tried to switch off with no effect:

SET SESSION optimizer_switch='block_nested_loop=off';

Any ideas to make this query perform much better?
Thanks!

Best Answer

Looks like a mismatch between the CHARSET on the two tables. Try setting the same CHARSET and COLLATION on both tables.