Mysql – How to find duplicate key violations in a different collation

collationMySQL

I have a MySQL table that has the collation for one of its varchar columns in the primary key set to utf8_general_ci. I want to change the collation to utf8_unicode_ci so it stops treating "gmail.com" and "gmàil.com" like they're the same item. I tried running this:

alter table my_table change character set utf8 collate utf8_unicode_ci;

The alter fails with a duplicate key violation on one of the items.

Error 1062 (23000): Duplicate entry 2013-07-15-119-comcast.net for key PRIMARY

I tried selecting the items out of the table that have date equal to 2013-07-15 and source_id equal to 119, but eyeballing the results I don't see two entries for "comcast.net" that might be causing the problem.

Can I construct a query into this table to determine which two rows are different under utf8_general_ci but the same under utf8_unicode_ci?

Best Answer

A query like this did the trick:

select date, thing_id, email_domain collate utf8_unicode_ci
from my_table
where date = '2013-07-15'
and thing_id = 119
and email_domain collate utf8_unicode_ci = 'comcast.net';