Mariadb – Can’t search for non-ascii characters

encodingmariadb

I have a database with Turkish names and surnames. Here is some information about my table:

CREATE TABLE `sorted_tc` (
  `TCKIMLIKNO` double NOT NULL,
  `ADI` varchar(40) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `SOYADI` varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `ANAADI` varchar(40) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `BABAADI` varchar(40) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `DOGUMYERI` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `DOGUMTARIHI` varchar(15) COLLATE utf8_unicode_ci NOT NULL,
  `NUFUSILI` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `NUFUSILCESI` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `ADRESILI` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `ADRESILCESI` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`TCKIMLIKNO`),
  KEY `anababa` (`BABAADI`,`ANAADI`),
  KEY `adsoyad` (`SOYADI`,`ADI`),
  KEY `adresili` (`ADRESILI`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

select hex(ADI), hex(SOYADI) from sorted_tc WHERE TCKIMLIKNO=12161429422;

5941C59E4152 41524142414349

5941C59E4152 is hex value for utf8 encoded 'YAŞAR'.

When I search for 'yasar', I want to get both 'yasar' and 'yaşar'. Similiarly, when I search for 'yaşar', I should get both 'yasar' and 'yaşar'.

However, when I search for 'yaşar', I am not getting any rows. Here is hex dump of relevant query from mysql query log:

0004-15a0:  51 75 65 72-79 09 73 65-6c 65 63 74-20 2a 20 66  Query.se lect.*.f
0004-15b0:  72 6f 6d 20-73 6f 72 74-65 64 5f 74-63 20 57 48  rom.sort ed_tc.WH
0004-15c0:  45 52 45 20-41 44 49 3d-22 79 61 c5-9f 61 72 22  ERE.ADI= "ya..ar"
0004-15d0:  20 41 4e 44-20 53 4f 59-41 44 49 3d-22 61 72 61  .AND.SOY ADI="ara
0004-15e0:  62 61 63 c4-b1 22 0a 31-35 31 32 32-36 20 31 34  bac..".1 51226.14

My table orignally used collation utf-8 general ci, and I could search for 'yaşar' with success.

I am using Mariadb 5.3

Best Answer

Sometimes ago, I faced a similar issue and I solved that issue with the help of my friend. How I did was I changed the collation to uft8_turkish_ci in MySQL. Also from MySQL website, I found the following document, which may be helpful for you,

MySQL Character set

Thanks.