Mysql – Why does MySQL optimize query with “WHERE … OR” so badly

execution-planjoin;MySQLperformancequery-performancewhere

Hey there,
this is my first question so please be patient if I miss to add something in here.

My question is why the OR in the WHERE clause causes the query to be executed the way it is (which is poorly). There is some detailed information about my scenario below.

Server Version is ArchLinux MariaDB 10.1.35-1


While I was trying to improve the wall time of a query I noticed that the optimizer does some bad optimizing for a relatively simple query. In this simplified scenario I have two tables, person and contract. A contract has one or two affiliates (foreign key to person).

describe person;
+-----------+-------------+------+-----+---------+----------------+
| Field     | Type        | Null | Key | Default | Extra          |
+-----------+-------------+------+-----+---------+----------------+
| id        | int(11)     | NO   | PRI | NULL    | auto_increment |
| firstname | varchar(45) | YES  |     | NULL    |                |
| lastname  | varchar(45) | YES  | MUL | NULL    |                |
+-----------+-------------+------+-----+---------+----------------+

describe contract;
+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| id         | int(11)     | NO   | PRI | NULL    | auto_increment |
| state      | varchar(45) | YES  | MUL | NULL    |                |
| affiliate1 | int(11)     | YES  | MUL | NULL    |                |
| affiliate2 | int(11)     | YES  |     | NULL    |                |
+------------+-------------+------+-----+---------+----------------+

There are indexes on (state), (state,affiliate1), (state,affiliate2), (state,affiliate1,affiliate2) and (lastname).


The query I was working on should summarize how many contracts are in which state, filtered by the lastname of either affiliate. For example I execute a query for the lastname 'Doe' I get something like

+-------------+----------+
| state       | count(*) |
+-------------+----------+
| CREATED     | 5        |
| IN PROGRESS | 1        |
| ACCEPTED    | 2        |
+-------------+----------+

The initial query looked like this

SELECT state,count(*) 
FROM contract 
INNER JOIN person a1 ON a1.id=contract.affiliate1 
LEFT OUTER JOIN person a2 ON a2.id=contract.affiliate2
WHERE a1.lastname=? OR a2.lastname=?
GROUP BY state;

I tested this query on a database with 1 million contracts, each having two distinct affiliates (so there are 2 million persons). This query took about 3 seconds to execute. If a left out one of a1.lastname=? or a2.lastname=? the query takes about 0.01 second. In the end I came up with using a UNION to produce my result in a single query (which is very fast, too).

Here is the execution plan for the slow query I got from the slow query log:

# Query_time: 2.449397  Lock_time: 0.001430  Rows_sent: 1  Rows_examined: 3000000
# Rows_affected: 0
# Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: No
#
# explain: id   select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered        r_filtered      Extra
# explain: 1    SIMPLE  contract  index   UK_contract_affiliate1,FK_contract_affiliate2     id_a1_a2      13      NULL    902857  1000000.00      100.00  100.00  Using where; Using index
# explain: 1    SIMPLE  a1     eq_ref  PRIMARY,lastname        PRIMARY 4       contract.affiliate1     1       1.00    100.00  100.00  
# explain: 1    SIMPLE  a2     eq_ref  PRIMARY,lastname        PRIMARY 4       contract.affiliate2     1       1.00    100.00  0.00    Using where
#

EDIT: As requested, the EXPLAIN output

EXPLAIN     SELECT  contract.state
    FROM  contract
    INNER JOIN  person kn1  ON contract.id_affiliate1=affiliate1.id
    WHERE  affiliate1.lastname='ABC' 

+------+-------------+----------+--------+------------------------------------------------------------+---------------+---------+----------------------+------+--------------------------+
| id   | select_type | table    | type   | possible_keys                                              | key           | key_len | ref                  | rows | Extra                    |
+------+-------------+----------+--------+------------------------------------------------------------+---------------+---------+----------------------+------+--------------------------+
|    1 | SIMPLE      | kn1      | ref    | PRIMARY,lastname                                           | lastname      | 272     | const                |   13 | Using where; Using index |
|    1 | SIMPLE      | contract | eq_ref | UK_contract_affiliate1,state,affiliate1_type_status,affiliate2_type_status | UK_contract_affiliate1 | 4       | affiliate1.id |    1 | Using where              |
+------+-------------+----------+--------+------------------------------------------------------------+---------------+---------+----------------------+------+--------------------------+

EXPLAIN     SELECT  contract.state
    FROM  contract
    INNER JOIN  person a1  ON contract.affiliate1=a1.id a1.lastname='ABC'
    UNION  ALL 
SELECT  contract.state
    FROM  contract
    INNER JOIN  person a2  ON contract.affiliate2=a2.id a2.lastname='ABC'
+------+-------------+------------+--------+----------------------------------------------------------------+------------------------+---------+----------------------+------+--------------------------+
| id   | select_type | table      | type   | possible_keys                                                  | key                    | key_len | ref                  | rows | Extra                    |
+------+-------------+------------+--------+----------------------------------------------------------------+------------------------+---------+----------------------+------+--------------------------+
|    1 | PRIMARY     | affiliate1 | ref    | PRIMARY,lastname                                               | lastname               | 272     | const                |    1 | Using where; Using index |
|    1 | PRIMARY     | contract   | eq_ref | UK_contract_affiliate1,state,affiliate1_state,affiliate2_state | UK_contract_affiliate1 | 4       | affiliate1.id        |    1 | Using where              |
|    2 | UNION       | affiliate2 | ref    | PRIMARY,lastname                                               | lastname               | 272     | const                |    1 | Using where; Using index |
|    2 | UNION       | contract   | ref    | FK_contract_affiliate2,state,affiliate1_state,affiliate2_state | FK_contract_affiliate2 | 5       | affiliate2.id        |    1 | Using where              |
+------+-------------+------------+--------+----------------------------------------------------------------+------------------------+---------+----------------------+------+--------------------------+

Best Answer

OR does not optimize well because it would involve searching for all possible rows with one of the things, then check for the other, then combine the two lists.

Your original query is even worse since the things being OR'd together come from different tables. MySQL really punts on that. So it reads both tables entirely.

The UNION does what the Optimizer can't do; that is focus on one part of the OR at a time.

See my index cookbook