Mysql – Why is ORDER BY LIMIT 1 so much faster than MAX

aggregateinnodbMySQLorder-byperformance

Here are my two queries:

select sts_in 
from sta_session 
where sts_user_id=2006 
AND sts_sessid!='0jitkt80gg3avere03tqk4lhi6' 
order by sts_in desc limit 1;

And

select max(sts_in) 
from sta_session 
where sts_user_id=2006 
AND sts_sessid!='0jitkt80gg3avere03tqk4lhi6';

For reference, the table looks like this:

CREATE TABLE `sta_session` (
    `sta_session_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    `sts_sessid` varchar(255) NOT NULL COMMENT 'PHP session ID',
    `sts_user_id` int(10) unsigned NOT NULL,
    `sts_in` bigint(20) NOT NULL,
    `sts_out` bigint(20) NOT NULL,
    `sts_ip` varchar(255) NOT NULL,
    `sts_browser_id` int(10) unsigned NOT NULL,
    PRIMARY KEY (`sta_session_id`),
    KEY `sts_sessid` (`sts_sessid`),
    KEY `sts_user_id` (`sts_user_id`),
    KEY `sts_ip` (`sts_ip`),
    KEY `fk_sta_browser_id` (`sts_browser_id`),
    KEY `idx_last_login` (`sts_user_id`,`sts_in`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

If I run explain on both the queries, the first one (with order by) uses my special index that I created just for this query, idx_last_login, but the other one uses sts_user_id.

Can the MAX query not take advantage of idx_last_login? Why not?

I know it can use that index, because I can force it, but I suspect it's ignoring the sts_in part of it and just searching by user id.


The EXPLAINs, in JSON format for easy reading.

explain select sts_in from sta_session where sts_user_id=2006 AND sts_sessid!='0jitkt80gg3avere03tqk4lhi6' order by sts_in desc limit 1;

[{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "sta_session",
  "type": "ref",
  "possible_keys": "sts_sessid,sts_user_id,idx_last_login",
  "key": "idx_last_login",
  "key_len": "4",
  "ref": "const",
  "rows": 723106,
  "Extra": "Using where"
 }
]


explain select max(sts_in) from sta_session where sts_user_id=2006 AND sts_sessid!='0jitkt80gg3avere03tqk4lhi6';

[{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "sta_session",
  "type": "ref",
  "possible_keys": "sts_sessid,sts_user_id,idx_last_login",
  "key": "sts_user_id",
  "key_len": "4",
  "ref": "const",
  "rows": 723107,
  "Extra": "Using where"
 }
]

Best Answer

It makes a lot of sense to me.

The MySQL Query Optimizer looks over the WHERE, GROUP BY and ORDER BY clauses.

Look at the first query

select sts_in 
from sta_session 
where sts_user_id=2006 
AND sts_sessid!='0jitkt80gg3avere03tqk4lhi6' 
order by sts_in desc limit 1;

Which index in sta_session has the most columns mentioned in the WHERE, GROUP BY, and ORDER BY clauses ? idx_last_login

How does MySQL locate the MAX value ?

  • Traverse sts_user_id in idx_last_login index for 2006
  • Go to the Last Index Entry for sts_in for sts_user_id 2006
  • Scroll backwards until sts_sessid!='0jitkt80gg3avere03tqk4lhi6'
  • Limit result to 1 row

Look at the second query

select max(sts_in) 
from sta_session 
where sts_user_id=2006 
AND sts_sessid!='0jitkt80gg3avere03tqk4lhi6';

Which index in sta_session has the most columns mentioned in the WHERE, GROUP BY, and ORDER BY clauses ? Not idx_last_login, but sts_user_id.

How does MySQL locate the MAX value ?

  • Full Index Range Scan sts_user_id for 2006
  • Aggregate sts_in from the table, comparing all values of sts_sessid <> '0jitkt80gg3avere03tqk4lhi6'

POSSIBLE WORKAROUND

Since sts_user_id and idx_last_login are duplicate indexes (because they have the same first column), you should run

ALTER TABLE sta_session DROP INDEX sts_user_id;

There is a possibility for idx_last_login to be chosen and aggregate sts_in. However, there is still some traversal across the table.

If you really want to get aggressive with indexing, create this one

ALTER TABLE sta_session ADD INDEX everything_and_the_kitchen_sink_index
(sts_user_id,sts_in,sts_sessid);

This index might get selected and never have to touch the table because all columns (WHERE, ORDER BY, MAX()) are in the index.

GIVE IT A TRY !!!