DB and fields are UTF8_unicode_ci collation – MySql v 5.6
HTML charset is utf 8
Double byte script was entered successfully prior to this issue (Chinese, Japanese, plus Cyrillic etc) when using plain non encrypted input.
But trying to use AES_ENCRYPT on input mangles all non-english script.
DB field is BLOB.
UPDATE mytable SET bodytext=AES_ENCRYPT('任車心是,防少劇和的;子反華不就','aeskey') WHERE t_id=123
produces strings of question marks – English is fine though
I'm not sure if I should try to CONVERT the input to UTF8 first – but when I try that I get an error.
Any suggestions of what to review?
Best Answer
Forcing utf8mb4 does the trick. Run a query against MySQL immediately after making the connection with:
SET NAMES
'utf8mb4';
Alternatively, use these settings in your my.cnf file to get the same effect:
(Note: These settings are defaults (for the most part) in MySQL 8.0 and MariaDB 10.3, so this probably isn't a problem there.)
Chinese (and other) scripts are multibyte - standard utf8 allows 3 bytes per character - fine for plain text but using AES_ENCRYPT ON 3 bytes requires another byte - thus setting utf8mb4 allows the required 4 bytes per character.