Mysql – Index not being used for logical operation

indexindex-tuningMySQL

We have a MySQL table with a few million rows. We have set of users identified by a flag which int value. We are facing a problem that mysql is not using index. For example, following query is very slow:

select email, flag from usertable where (flag&0x40)!=0 limit 10;

Following is explain

> explain select email, flag from usertable where (flag&0x40)!=0 limit 10;
+------+-------------+-----------+------+---------------+------+---------      +------+---------+-------------+
| id   | select_type | table     | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+------+-------------+-----------+------+---------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | usertable | ALL  | NULL          | NULL | NULL    | NULL | 8331189 | Using where |
+------+-------------+-----------+------+---------------+------+---------+------+---------+-------------+

and here is indexes

> show index from usertable;
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| usertable |          0 | PRIMARY     |            1 | uid         | A         |     8331189 |     NULL | NULL   |      | BTREE      |         |               |
| usertable |          0 | email_index |            1 | email       | A         |     8331189 |     NULL | NULL   | YES  | BTREE      |         |               |
| usertable |          1 | flagindex   |            1 | flag        | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |               |
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

As I understand, logical & operation might be prohibiting use of index. What could be the solution in that case?

Best Answer

Seems like you are using an integer column flag where a series of BIT columns or - even better - an intersection table should be used.

The index on (flag) is effectively useless for the condition where (flag & 0x40) != 0. Even if it was used, the whole index would have to be scanned first, then all the values computed (with & 0x40) and compared to 0. After that, the matching rows would have to be fetched from the table, resulting in multiple, even millions of seeks. The optimizer is clever enough to do a simple - and fast but still rather slow - whole table scan.

You could improve the efficiency somewhat with a "covering" index on (flag, email). That index would be used but only with a full index scan - you probably can't avoid that with the current design.


Other suggestions, some involving altering the table:

  • If you can split the flags to multiple BIT columns, the query could be simplified and the possibility of using indexes would arise. Still, you might end up needing multiple (composite) indexes, one for each BIT flag.

  • If the possible combinations of flags is low (as the cardinality = 18 suggests), you could use that info in your favour by changing the condition to:

    where flag in (64, 65, 66, 68, 72, 80, ..., 192, 193, ...)
    -- the list of all possible flag values that have the proper bit set
    

    This might use the (flag, email) index slightly better than before, especially if this list is small, still scanning a small or large part of the index but not the whole index.

  • A much better design would be to use an intersection (many-to-many) table to store which users have some attribute, instead of using flags.