Mysql – Self join increase paging query performance

MySQLoptimization

I saw a question on stackoverflow
mysql self join performance. The question mentioning that self join will increase query performance. For example this query

select m2.* from message m1,message m2
where m1.id=m2.id and m1.thumbs_up <=98
and (m1.id<13 or m1.thumbs_up<98)
order by m1.thumbs_up desc,m1.id desc

explain result

+----+-------------+-------+--------+-----------------------+---------------+---------+------------+-------+--------------------------+
| id | select_type | table | type   | possible_keys         | key           | key_len | ref        | rows  | Extra                    |
+----+-------------+-------+--------+-----------------------+---------------+---------+------------+-------+--------------------------+
|  1 | SIMPLE      | m1    | range  | PRIMARY,thumbs_up_key | thumbs_up_key | 4       | NULL       | 21574 | Using where; Using index |
|  1 | SIMPLE      | m2    | eq_ref | PRIMARY               | PRIMARY       | 4       | test.m1.id |     1 | NULL                     |
+----+-------------+-------+--------+-----------------------+---------------+---------+------------+-------+--------------------------+

have better performance than

select *
from message
where thumbs_up <=98 and (id<13 or thumbs_up<98)
order by thumbs_up desc,id desc

explain result

+----+-------------+---------+------+-----------------------+------+---------+------+-------+-----------------------------+
| id | select_type | table   | type | possible_keys         | key  | key_len | ref  | rows  | Extra                       |
+----+-------------+---------+------+-----------------------+------+---------+------+-------+-----------------------------+
|  1 | SIMPLE      | message | ALL  | PRIMARY,thumbs_up_key | NULL | NULL    | NULL | 43148 | Using where; Using filesort |
+----+-------------+---------+------+-----------------------+------+---------+------+-------+-----------------------------+

I tried the query using profiling and yes the performance is actually better specially in create sort index. Is it possible to explain why as I found it interesting thing to know (I tried using explain query but didn't find good explanation).

update:

create table query

CREATE TABLE `message`(
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `title` varchar(255) NOT NULL,
    `user_id` int(11) NOT NULL,
    `content` text NOT NULL,
    `create_time` int(11) NOT NULL,
    `thumbs_up` int(11) NOT NULL DEFAULT '0', /*Vote Count*/
    PRIMARY KEY (`id`),
    KEY `thumbs_up_key` (`thumbs_up`,`id`)
)

Also I removed all indexes except the primary key index, the performance difference was huge the like double the speed.

Best Answer

The likely reason for the difference you are seeing: The 2nd must haul * around in a tmp table and sort it, whereas the 1st avoids such.

You have a fully ordered, but 2-part, index. The Optimizer will do a good job if you are going forward from where you left off, but this is a different situation -- You want every row before some particular row (98:13).

Give this a try; it may trick the Optimizer into using the index for both the WHERE and the ORDER BY:

select *
    from message
    where thumbs_up <=98
    HAVING (id<13 or thumbs_up<98)
    order by thumbs_up desc, id desc

I think it will say the 'good' Using index but not the 'bad' Using filesort.

Another technique for comparing two formulations:

FLUSH STATUS;
SELECT SQL_NO_CACHE ...;
SHOW SESSION STATUS LIKE 'Handler%';