MySQL – Index Not Used on Partitioned Table

explainindexMySQLpartitioning

We have a table similar to this one

mysql> SHOW CREATE TABLE my_requests\G
*************************** 1. row ***************************
       Table: my_requests
Create Table: CREATE TABLE `my_requests` (
  `rq_id` bigint(20) NOT NULL,
  `t_id` bigint(20) NOT NULL,
  `u_id` bigint(20) DEFAULT NULL,
  `rq_date` datetime DEFAULT NULL,
  `rq_type` tinyint(4) DEFAULT '1',
  `rq_creationdate` datetime NOT NULL,
  `rq_modifydate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `rq_ttotal` mediumint(8) unsigned DEFAULT '0',
  KEY `index_rqcd` (`t_id`,`rq_date`),
  KEY `idx_rq_u` (`u_id`),
  KEY `idx_rq_id` (`rq_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY LIST (to_days(rq_date))
SUBPARTITION BY HASH (t_id)
SUBPARTITIONS 10
(PARTITION p736417 VALUES IN (736417) ENGINE = InnoDB,
 PARTITION p736419 VALUES IN (736419) ENGINE = InnoDB,
 PARTITION p736430 VALUES IN (736430) ENGINE = InnoDB,
 PARTITION p736439 VALUES IN (736439) ENGINE = InnoDB,
 PARTITION p736443 VALUES IN (736443) ENGINE = InnoDB,
 PARTITION p736444 VALUES IN (736444) ENGINE = InnoDB,
 PARTITION p736445 VALUES IN (736445) ENGINE = InnoDB,
 PARTITION p736446 VALUES IN (736446) ENGINE = InnoDB,
 PARTITION p736447 VALUES IN (736447) ENGINE = InnoDB,
 PARTITION p736448 VALUES IN (736448) ENGINE = InnoDB,
 PARTITION p736449 VALUES IN (736449) ENGINE = InnoDB,
 PARTITION p736450 VALUES IN (736450) ENGINE = InnoDB,
 PARTITION p736451 VALUES IN (736451) ENGINE = InnoDB,
 PARTITION p736452 VALUES IN (736452) ENGINE = InnoDB,
 PARTITION p736453 VALUES IN (736453) ENGINE = InnoDB,
 PARTITION p736454 VALUES IN (736454) ENGINE = InnoDB,
 PARTITION p736455 VALUES IN (736455) ENGINE = InnoDB,
 PARTITION p736456 VALUES IN (736456) ENGINE = InnoDB,
 PARTITION p736457 VALUES IN (736457) ENGINE = InnoDB,
 PARTITION p736458 VALUES IN (736458) ENGINE = InnoDB,
 PARTITION p736459 VALUES IN (736459) ENGINE = InnoDB,
 PARTITION p736460 VALUES IN (736460) ENGINE = InnoDB,
 PARTITION p736461 VALUES IN (736461) ENGINE = InnoDB,
 PARTITION p736462 VALUES IN (736462) ENGINE = InnoDB,
 PARTITION p736463 VALUES IN (736463) ENGINE = InnoDB,
 PARTITION p736464 VALUES IN (736464) ENGINE = InnoDB,
 PARTITION p736465 VALUES IN (736465) ENGINE = InnoDB,
 PARTITION p736466 VALUES IN (736466) ENGINE = InnoDB,
 PARTITION p736467 VALUES IN (736467) ENGINE = InnoDB,
 PARTITION p736468 VALUES IN (736468) ENGINE = InnoDB,
 PARTITION p736469 VALUES IN (736469) ENGINE = InnoDB) */

with partitions by day, the table is huge and has hundreds of millions of registers, and I'm trying to select the data from the last day and avoid a full scan and access only the last partitions.

But when I do a explain the index is not used.

EXPLAIN SELECT * FROM my_requests USE INDEX(index_rqcd)
     WHERE t_id <> -1 AND rq_date = DATE_SUB(now(), INTERVAL 1 DAY);
+----+-------------+-------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table       | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | my_requests | ALL  | index_rqcd    | NULL | NULL    | NULL | 4020468 | Using where |
+----+-------------+-------------+------+---------------+------+---------+------+---------+-------------+

Edit:
The EXPLAIN PARTITIONS shows that the query only affect part of the partitions, but I still don't understand why the index is not used.

mysql> EXPLAIN PARTITIONS SELECT * FROM my_requests
         WHERE t_id <> -1 AND rq_date = DATE_SUB(now(), INTERVAL 1 DAY)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: my_requests
   partitions: p736467_p736467sp0,p736467_p736467sp1,p736467_p736467sp2,p736467_p736467sp3,p736467_p736467sp4,p736467_p736467sp5,p736467_p736467sp6,p736467_p736467sp7,p736467_p736467sp8,p736467_p736467sp9
         type: ALL
possible_keys: index_rqcd
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4064737
        Extra: Using where
1 row in set (0.06 sec)

Can you help me to understand why this index is not used for that query?

Best Answer

There are a few problems here:

First, as jkavalik says in the comments on the OP, the order of columns in an index matters. Basically, in your case for index_rqcd to be used for filtering on rq_date, t_id has to be used before it can "see" and filter on rq_date. Since usually only one range scan on an index can be done for a query and it has to be the last part of the index used, the optimizer correctly skips using this index because it would probably have to use the entire index which cause a full scan on the index and then a full scan on the table, or in this case partition. If the table and had an index starting with rq_date, that index would likely be used.

Second, something you should be aware of when using partitions is that MySQL doesn't have global indexes for partitioned tables. What this means is that each partition has its own indexes and therefore can only be used inside that partition. Since the optimizer knows that the query only needs to use partition p736467 based on the query and table definition (an optimization known as partition pruning), only that partition, and in your case the associated sub-partitions, are considered when running the query. So now that the optimizer is only using partition p736467, it can consider using index_rqcd, but since it is still affected by the first problem and its use would lead to a full scan on the index, it chooses to do a full scan on the data. Even though the EXPLAIN shows ALL for the type, it is only scanning the data in the p736467 partition, not the entire table, because of the partition pruning. Even if you add an index starting with rq_date, it will not be used if you want to get all of the rows for the day the partition is for. However, the query you give will only return rows for exactly 1 day ago down to the second, not the whole day as you say when describing what you want to do. If you really do want to get rows for a specific time and not the whole day, and index starting with rq_date will be useful.

Probably the most important thing here is that, as mentioned by the link jkavalik posted, you shouldn't use partitions unless you have a good reason to. Time ranges can be a valid use for partitions, but usually only when used with a sliding window of chronological data, e.g. 7 days of log messages that are pruned daily. Generally unless you can use partition pruning in (almost) every query, they will hurt performance since every partition will likely need to be accessed. If you are always querying this table for rows on a specific date and then filtering on a range on some other indexed column, partitions might be useful. More likely, partitions are not the right way to go. Unless you have a very good reason to use the sub-partitions, they are almost certainly just adding overhead to every query.

My suggestion, assuming that most of your queries are based on a range of rq_date, is to not use partitions and add an index starting with rq_date.

Also, when using InnoDB, you should always define a primary key. I don't know about the data you're storing, but it looks like rq_id might be unique and a good candidate.