Mysql – do to make thesql use the expected indices

clustered-primary-keyMySQLprimary-key

What can I do to make mysql use the expected indices?

I've got 4 tables, two containing resources, and the others containing historical changes.

One pair uses indexes correctly, the other doesn't, but both are structured almost identically.

I've tried changing the order of the primary keys, and the order of the other keys, I've tried changing table structures so they use the same name in both tables, and both have the same key names, but nothing seems to make the query use the correct index.

Irrelevant columns have been removed for brevity.

These two tables are working as expected.

CREATE TABLE `players` (
  `player_id` varbinary(36) NOT NULL DEFAULT '',
  `pop_rank_score` double NOT NULL DEFAULT '0',
  PRIMARY KEY (`player_id`),
  KEY `pop_rank_score` (`pop_rank_score`),
  KEY `weblinc_id` (`weblinc_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `poprankhistory` (
  `day_id` int(11) NOT NULL,
  `player_id` varbinary(36) NOT NULL DEFAULT '',
  `total` double NOT NULL DEFAULT '0',
  `today` double NOT NULL DEFAULT '0',
  PRIMARY KEY (`day_id`,`player_id`),
  KEY `day_id` (`day_id`),
  KEY `player_id` (`player_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1


explain select p.`player_id`, p.pop_rank_score + 0.5 * COALESCE(h1.total,0) as pop_rank_score
from fpme_lua.`Players` p, fpme_lua.PopRankHistory h1
where ( p.`player_id` = h1.`player_id` AND h1.day_id = (SELECT Max(h2.day_id) AS day_id
  FROM   fpme_lua.poprankhistory h2
  WHERE  h2.day_id <= 15786 and h2.player_id = p.`player_id` ));

+----+--------------------+-------+--------+--------------------------+-----------+---------+-----------------------+-------+--------------------------+
| id | select_type        | table | type   | possible_keys            | key       | key_len | ref                   | rows  | Extra                    |
+----+--------------------+-------+--------+--------------------------+-----------+---------+-----------------------+-------+--------------------------+
|  1 | PRIMARY            | h1    | ALL    | PRIMARY,day_id,player_id | NULL      | NULL    | NULL                  | 25391 |                          |
|  1 | PRIMARY            | p     | eq_ref | PRIMARY                  | PRIMARY   | 38      | fpme_lua.h1.player_id |     1 | Using where              |
|  2 | DEPENDENT SUBQUERY | h2    | ref    | PRIMARY,day_id,player_id | player_id | 38      | fpme_lua.p.player_id  |     2 | Using where; Using index |
+----+--------------------+-------+--------+--------------------------+-----------+---------+-----------------------+-------+--------------------------+

These tables aren't working as expected (required).

CREATE TABLE `pictures` (
  `id` varchar(36) NOT NULL DEFAULT '',
  `pcr_score` double NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `owner_id` (`owner_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `picpcrhistory` (
  `day_id` int(11) NOT NULL,
  `target_id` varchar(36) NOT NULL DEFAULT '',
  `total` double NOT NULL DEFAULT '0',
  `today` double NOT NULL DEFAULT '0',
  PRIMARY KEY (`day_id`,`target_id`),
  KEY `target_id` (`target_id`),
  KEY `day_id` (`day_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1


explain select p.`id`, p.pcr_score + 0.5 * COALESCE(h1.total,0) as pcr_score
from fpme_lua.`Pictures` p, fpme_lua.PicPcrHistory h1
where ( p.`id` = h1.`target_id` AND h1.day_id = (SELECT Max(h2.day_id) AS day_id
  FROM   fpme_lua.PicPcrHistory h2
  WHERE  h2.day_id <= 15786 and h2.`target_id` = p.`id` ));

+----+--------------------+-------+--------+----------------+---------+---------+------+-------+--------------------------+
| id | select_type        | table | type   | possible_keys  | key     | key_len | ref  | rows  | Extra                    |
+----+--------------------+-------+--------+----------------+---------+---------+------+-------+--------------------------+
|  1 | PRIMARY            | h1    | ALL    | PRIMARY,day_id | NULL    | NULL    | NULL | 65310 |                          |
|  1 | PRIMARY            | p     | eq_ref | PRIMARY        | PRIMARY | 110     | func |     1 | Using where              |
|  2 | DEPENDENT SUBQUERY | h2    | range  | PRIMARY,day_id | day_id  | 4       | NULL | 21824 | Using where; Using index |
+----+--------------------+-------+--------+----------------+---------+---------+------+-------+--------------------------+

Best Answer

I think that better indexes for these queries are these (which you don't have on the tables now): (player_id, day_id) and (target_id, day_id)

The optimizer choosing different indexes in the two situations has probably to due with table sizes and selectivity of the available indexes. I suggest you add the two indexes above and check (and compare) the new execution plans and timings with the old ones.

Update For InnoDB tables, the (target_id) and (target_id, day_id) indexes should be equivalent as all non clustered indexes also include the columns of the clustered index. It could be a blind spot on the optimizer, not identifying that the (target_id) index can be used as if it was a (target_id, day_id) index. But seeing that the similar index in the first case is chosen, the problem must be somewhere else.

And it is: PicPcrHistory.target_id is of type varchar(36) and so is Pictures.id But one has utf8 and the other latin1 charset. So, there is your problem and why the index is not used. The two columns that are joined are not identical (type and charset.)

Change the charset of either one to match the other charset and the index will be used.


There are also other ways to write the queries - which appear to be of the [greatest-n-per-group] type:

  • Using a derived table:

    SELECT p.id, 
           p.pcr_score + 0.5 * COALESCE(h1.total,0) AS pcr_score
    FROM 
        fpme_lua.Pictures AS p
      JOIN 
        fpme_lua.PicPcrHistory AS h1
          ON  h1.target_id = p.id 
      JOIN
        ( SELECT   hh.target_id,
                   Max(hh.day_id) AS day_id
          FROM     fpme_lua.PicPcrHistory AS hh
          WHERE    hh.day_id <= 15786 
          GROUP BY hh.target_id
        ) AS h2
          ON  h2.target_id = h1.target_id
          AND h2.day_id = h1.day_id ;
    
  • Using a correlated subquery for the join condition (this is similar to the queries you have):

    SELECT p.id, 
           p.pcr_score + 0.5 * COALESCE(h1.total,0) AS pcr_score
    FROM 
        fpme_lua.Pictures AS p
      JOIN 
        fpme_lua.PicPcrHistory AS h1
          ON  h1.target_id = p.id 
          AND h1.day_id =
            ( SELECT   hh.day_id
              FROM     fpme_lua.PicPcrHistory AS hh
              WHERE    hh.target_id = p.id
                AND    hh.day_id <= 15786 
              ORDER BY hh.target_id DESC
                  LIMIT 1
            ) ;
    

And index on (target_id, day_id, total) will be even better for the last query.