Mysql – optimizer uses all the wrong indexes instead of one right

eavmariadbMySQLoptimization

This is another stupid question about optimizer. Synopsis is exactly as the subject is: the optimizer does it wrong. Seems like is has an absolute desire to use any index that contains REALM_ID field: I can create dozens of composite or simple indexes that contain this field, and as long as I add them into the IGNORE INDEX clause, it just picks another one. The thing is, this is actually a Keycloak with a custom app, so there's a layer of Hibernate framework abstraction, so instead of optimiser hints I'd rather change the whole query just in order to make optimizer just choose the primary key. I've also tried to add order by UE.ID clause to minify the influence of REALM_ID, but this doesn't help actually.

Table:

USER_ENTITY | CREATE TABLE `USER_ENTITY` (
  `ID` varchar(36) NOT NULL,
  `EMAIL` varchar(255) DEFAULT NULL,
  `EMAIL_CONSTRAINT` varchar(255) DEFAULT NULL,
  `EMAIL_VERIFIED` bit(1) NOT NULL DEFAULT b'0',
  `ENABLED` bit(1) NOT NULL DEFAULT b'0',
  `FEDERATION_LINK` varchar(255) DEFAULT NULL,
  `FIRST_NAME` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `LAST_NAME` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `REALM_ID` varchar(255) DEFAULT NULL,
  `USERNAME` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `CREATED_TIMESTAMP` bigint(20) DEFAULT NULL,
  `SERVICE_ACCOUNT_CLIENT_LINK` varchar(36) DEFAULT NULL,
  `NOT_BEFORE` int(11) NOT NULL DEFAULT 0,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `UK_RU8TT6T700S9V50BU18WS5HA6` (`REALM_ID`,`USERNAME`),
  UNIQUE KEY `UK_DYKN684SL8UP1CRFEI6ECKHD7` (`REALM_ID`,`EMAIL_CONSTRAINT`),
  KEY `IDX_USER_EMAIL` (`EMAIL`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

Query:

select UE.ID,
    UE.CREATED_TIMESTAMP,
    UE.EMAIL,
    UE.EMAIL_CONSTRAINT,
    UE.EMAIL_VERIFIED,
    UE.ENABLED,
    UE.FEDERATION_LINK,
    UE.FIRST_NAME,
    UE.LAST_NAME,
    UE.NOT_BEFORE,
    UE.REALM_ID,
    UE.SERVICE_ACCOUNT_CLIENT_LINK,
    UE.USERNAME
from USER_ENTITY UE inner join USER_ATTRIBUTE AT on UE.ID=AT.USER_ID
where UE.REALM_ID='user' and
    AT.NAME='phone' and
    (NULL is null or UE.ID<>NULL) and
    AT.VALUE='01370003403';

Torture sequence:

Untuned:

MariaDB [sso]> select UE.ID, UE.CREATED_TIMESTAMP, UE.EMAIL, UE.EMAIL_CONSTRAINT, UE.EMAIL_VERIFIED, UE.ENABLED, UE.FEDERATION_LINK, UE.FIRST_NAME, UE.LAST_NAME, UE.NOT_BEFORE, UE.REALM_ID, UE.SERVICE_ACCOUNT_CLIENT_LINK, UE.USERNAME from USER_ENTITY UE inner join USER_ATTRIBUTE AT on UE.ID=AT.USER_ID where UE.REALM_ID='user' and AT.NAME='phone' and (NULL is null or UE.ID<>NULL) and AT.VALUE='01370003403';
[..]
1 row in set (1.257 sec)

Plan for untuned:

MariaDB [sso]> explain select UE.ID, UE.CREATED_TIMESTAMP, UE.EMAIL, UE.EMAIL_CONSTRAINT, UE.EMAIL_VERIFIED, UE.ENABLED, UE.FEDERATION_LINK, UE.FIRST_NAME, UE.LAST_NAME, UE.NOT_BEFORE, UE.REALM_ID, UE.SERVICE_ACCOUNT_CLIENT_LINK, UE.USERNAME from USER_ENTITY UE inner join USER_ATTRIBUTE AT on UE.ID=AT.USER_ID where UE.REALM_ID='user' and AT.NAME='phone' and (NULL is null or UE.ID<>NULL) and AT.VALUE='01370003403';
+------+-------------+-------+------+-------------------------------------------------------------------------------------+------------------------------------+---------+-----------------------+-------+--------------------------+
| id   | select_type | table | type | possible_keys                                                                       | key                                | key_len | ref                   | rows  | Extra                    |
+------+-------------+-------+------+-------------------------------------------------------------------------------------+------------------------------------+---------+-----------------------+-------+--------------------------+
|    1 | SIMPLE      | UE    | ref  | PRIMARY,UK_RU8TT6T700S9V50BU18WS5HA6,UK_DYKN684SL8UP1CRFEI6ECKHD7                   | UK_RU8TT6T700S9V50BU18WS5HA6       | 1023    | const                 | 90582 | Using index condition    |
|    1 | SIMPLE      | AT    | ref  | IDX_USER_ATTRIBUTE,IDX_USER_ATTRIBUTE_USER_VALUE,IDX_USER_ATTRIBUTE_USER_NAME_VALUE | IDX_USER_ATTRIBUTE_USER_NAME_VALUE | 1936    | sso.UE.ID,const,const | 1     | Using where; Using index |
+------+-------------+-------+------+-------------------------------------------------------------------------------------+------------------------------------+---------+-----------------------+-------+--------------------------+
2 rows in set (0.001 sec)

Tuned:

MariaDB [sso]> select UE.ID, UE.CREATED_TIMESTAMP, UE.EMAIL, UE.EMAIL_CONSTRAINT, UE.EMAIL_VERIFIED, UE.ENABLED, UE.FEDERATION_LINK, UE.FIRST_NAME, UE.LAST_NAME, UE.NOT_BEFORE, UE.REALM_ID, UE.SERVICE_ACCOUNT_CLIENT_LINK, UE.USERNAME from USER_ENTITY UE IGNORE INDEX(UK_RU8TT6T700S9V50BU18WS5HA6, UK_DYKN684SL8UP1CRFEI6ECKHD7) inner join USER_ATTRIBUTE AT on UE.ID=AT.USER_ID where UE.REALM_ID='user' and AT.NAME='phone' and (NULL is null or UE.ID<>NULL) and AT.VALUE='01370003403';
    [...]
1 row in set (0.088 sec)

Plan for tuned:

MariaDB [sso]> explain select UE.ID, UE.CREATED_TIMESTAMP, UE.EMAIL, UE.EMAIL_CONSTRAINT, UE.EMAIL_VERIFIED, UE.ENABLED, UE.FEDERATION_LINK, UE.FIRST_NAME, UE.LAST_NAME, UE.NOT_BEFORE, UE.REALM_ID, UE.SERVICE_ACCOUNT_CLIENT_LINK, UE.USERNAME from USER_ENTITY UE IGNORE INDEX(UK_RU8TT6T700S9V50BU18WS5HA6, UK_DYKN684SL8UP1CRFEI6ECKHD7) inner join USER_ATTRIBUTE AT on UE.ID=AT.USER_ID where UE.REALM_ID='user' and AT.NAME='phone' and (NULL is null or UE.ID<>NULL) and AT.VALUE='01370003403';
+------+-------------+-------+--------+-------------------------------------------------------------------------------------+------------------------------------+---------+----------------+--------+--------------------------+
| id   | select_type | table | type   | possible_keys                                                                       | key                                | key_len | ref            | rows   | Extra                    |
+------+-------------+-------+--------+-------------------------------------------------------------------------------------+------------------------------------+---------+----------------+--------+--------------------------+
|    1 | SIMPLE      | AT    | index  | IDX_USER_ATTRIBUTE,IDX_USER_ATTRIBUTE_USER_VALUE,IDX_USER_ATTRIBUTE_USER_NAME_VALUE | IDX_USER_ATTRIBUTE_USER_NAME_VALUE | 1938    | NULL           | 178593 | Using where; Using index |
|    1 | SIMPLE      | UE    | eq_ref | PRIMARY                                                                             | PRIMARY                            | 146     | sso.AT.USER_ID | 1      | Using where              |
+------+-------------+-------+--------+-------------------------------------------------------------------------------------+------------------------------------+---------+----------------+--------+--------------------------+
2 rows in set (0.001 sec)

Best Answer

EAV schemas suck. But this instance of it can be improved.

The fix (for the sluggishness)

ALTER TABLE USER_ATTRIBUTE
    ADD INDEX(NAME, VALUE, USER_ID);

Please provide SHOW CREATE TABLE USER_ATTRIBUTE -- there may be other improvements that can be made.

Why failed to use index(realm_id) ...

When looking up something by a secondary key, InnoDB first drills down the index's BTree to find a leaf row. That leaf row contains the PRIMARY KEY. Then it has to drill down the data's BTree to find the desired row. This 2-step process has some slowness.

The Optimizer says "if more than ~20% of the rows have REALM_ID='user', then don't bother with the secondary index; simply scan the table".

A JOIN is almost always performed by NLJ (Nested Loop Join), meaning that it gets the desired rows from one table. As it gets a row, it reaches into the next table.

My suggested index to the other table will encourage the Optimizer to start with the attribute table. This will probably finish much faster.