MySQL Collation – Difference Between utf8mb4_0900_as_cs and utf8mb4_0900_ai_ci

collationMySQL

My default collation_connection was 'utf8_generic_ci' and it caused an issue today with a polish character not fitting, thowing an error (the table has utf8mb4_0900_as_ci)

So I changed the connectin to utf8mb4_0900_as_cs but I don't know if that can affect performance, matching or returned data beyond the utf8 encoding.
Is there any difference between ci and cs in the connection collation ?

I of course understand the difference in column/table collation.

Best Answer

Changing the collation of the connection will have little to no impact on performance, but also on what you probably want to achieve. From the manual:

collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence

If you want to make comparisons in with or without case sensitivity/accent sensitivity, you will have to update the actual column's collation, or force the collation on every comparison manually (e.g. WHERE ). The latter may have low impact if no index is required, or a huge impact if the usage of an index is prevented because the comparison and the column use different collation.

In practice:

mysql> SET NAMES utf8mb4; -- we will be using this charset to send literals
mysql> SELECT 'A' = 'a';
+-----------+
| 'A' = 'a' |
+-----------+
|         1 |
+-----------+
1 row in set (0.000 sec)

mysql> SET SESSION collation_connection = 'utf8mb4_bin';
Query OK, 0 rows affected (0.000 sec)

mysql> SELECT 'A' = 'a'; -- it works on literals, but you will have to force it manually on columns
+-----------+
| 'A' = 'a' |
+-----------+
|         0 |
+-----------+
1 row in set (0.000 sec)

mysql> MariaDB [(none)]> SELECT 'A' = 'a' COLLATE utf8mb4_bin; -- may prevent index usage on columns
+-------------------------------+
| 'A' = 'a' COLLATE utf8mb4_bin |
+-------------------------------+
|                             0 |
+-------------------------------+
1 row in set (0.000 sec)

mysql> select 'A' = 'a' COLLATE utf8mb4_unicode_520_ci;  -- may prevent index usage on columns
+------------------------------------------+
| 'A' = 'a' COLLATE utf8mb4_unicode_520_ci |
+------------------------------------------+
|                                        1 |
+------------------------------------------+
1 row in set (0.004 sec)