Mysql – Pagination: Slow ORDER BY with large LIMIT range

indexMySQLorder-byperconaperformance

I have a forum that run this query to do pagination for category pages:

mysql> SELECT  thread.threadid,thread.lastpost, thread.lastposter,
               thread.lastpostid, thread.replycount, 
               IF(thread.views<=thread.replycount,
                  thread.replycount+1, thread.views) AS views
    ->
    ->                 FROM thread AS thread
    ->                 WHERE forumid = 21
    ->                 AND sticky = 0
    ->
    ->                 AND visible IN (0,1,2)
    ->
    ->                 ORDER BY sticky DESC, thread.dateline DESC
    ->                 LIMIT 950000, 25;

    25 rows in set (2.64 sec)

The large the limit, the more it takes to complete making the page slow to load.

I have already added indexes, however it doesn't make much difference in performance.

Thus far i have tried rewriting the query using JOIN which improves a bit, however i was wondering if there is any better aproach to optimize this query.

Unfortunately i can't use WHERE because threadids aren't sequential.

mysql> SELECT  thread.threadid,thread.lastpost, thread.lastposter,
               thread.lastpostid, thread.replycount,
               IF(thread.views<=thread.replycount,
                  thread.replycount+1, thread.views) AS views
    ->                 FROM thread AS thread
    ->                 JOIN (SELECT thread.threadid
    ->                 FROM   thread as thread
    ->                 WHERE forumid = 21
    ->                         AND sticky = 0
    ->                          AND visible IN (0,1,2)
    ->                 ORDER BY sticky DESC, thread.dateline DESC
    ->                 LIMIT 950550, 25) AS O2
    ->                 ON thread.threadid = O2.threadid
    ->                 ORDER BY sticky DESC, thread.dateline DESC
    -> ;

25 rows in set (1.49 sec)

EXPLAIN

+----+-------------+--------+------+---------------------+---------+---------+-------+---------+-----------------------------+
| id | select_type | table  | type | possible_keys       | key     | key_len | ref   | rows    | Extra                       |
+----+-------------+--------+------+---------------------+---------+---------+-------+---------+-----------------------------+
|  1 | SIMPLE      | thread | ref  | forumid2_dp | forumid2_dp | 2       | const | 1378739 | Using where; Using filesort |
+----+-------------+--------+------+---------------------+---------+---------+-------+---------+-----------------------------+
1 row in set (0.00 sec)

The table has 4.5m rows.

Table schema:

| thread | CREATE TABLE `thread` (
  `threadid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(250) NOT NULL DEFAULT '',
  `prefixid` varchar(25) NOT NULL DEFAULT '',
  `firstpostid` int(10) unsigned NOT NULL DEFAULT '0',
  `lastpostid` int(10) unsigned NOT NULL DEFAULT '0',
  `lastpost` int(10) unsigned NOT NULL DEFAULT '0',
  `forumid` smallint(5) unsigned NOT NULL DEFAULT '0',
  `pollid` int(10) unsigned NOT NULL DEFAULT '0',
  `open` smallint(6) NOT NULL DEFAULT '0',
  `replycount` int(10) unsigned NOT NULL DEFAULT '0',
  `hiddencount` int(10) unsigned NOT NULL DEFAULT '0',
  `deletedcount` int(10) unsigned NOT NULL DEFAULT '0',
  `postusername` varchar(100) NOT NULL DEFAULT '',
  `postuserid` int(10) unsigned NOT NULL DEFAULT '0',
  `lastposter` varchar(100) NOT NULL DEFAULT '',
  `dateline` int(10) unsigned NOT NULL DEFAULT '0',
  `views` int(10) unsigned NOT NULL DEFAULT '0',
  `iconid` smallint(5) unsigned NOT NULL DEFAULT '0',
  `notes` varchar(250) NOT NULL DEFAULT '',
  `visible` smallint(6) NOT NULL DEFAULT '0',
  `sticky` smallint(6) NOT NULL DEFAULT '0',
  `votenum` smallint(5) unsigned NOT NULL DEFAULT '0',
  `votetotal` smallint(5) unsigned NOT NULL DEFAULT '0',
  `attach` smallint(5) unsigned NOT NULL DEFAULT '0',
  `similar` varchar(55) NOT NULL DEFAULT '',
  `taglist` mediumtext,
  `vbseo_linkbacks_no` int(10) unsigned NOT NULL DEFAULT '0',
  `autoskip` smallint(6) DEFAULT '0',
  `tps_thread_points` int(10) unsigned DEFAULT '0',
  PRIMARY KEY (`threadid`),
  KEY `postuserid` (`postuserid`),
  KEY `pollid` (`pollid`),
  KEY `forumid` (`forumid`,`visible`,`sticky`,`lastpost`),
  KEY `lastpost` (`lastpost`,`forumid`),
  KEY `dateline` (`dateline`),
  KEY `prefixid` (`prefixid`,`forumid`),
  KEY `forumid2_dp` (`forumid`,`visible`,`sticky`,`dateline`)
) ENGINE=InnoDB AUTO_INCREMENT=4879868 DEFAULT CHARSET=utf8 |

Best Answer

First, I must ask what the heck LIMIT 950000, 25 is for? Is some human paginating through a zillion pages? Or is a search engine doing it? I can't imagine some human with the patience to go that far, so why even provide the capability??

(Sorry, I get irritated at poorly thought-through UIs.) Now I will get off my high horse and try to answer your question.

  • The reformuation seemed to help. But did you run the timings twice? It may be that the first run did more I/O, leaving data cached for the second.
  • The use of the subquery is a common trick, though it is unclear (to me) whether it would help in this case.
  • The mixture of IN and ORDER BY says that no index will work all the way through the ORDER BY. That is, the query cannot be optimized to the point of touching only 950000+25 rows. (Not that that is much help.)
  • The subquery ("derived table") in the second formulation takes advantage of KEY forumid2_dp (forumid,visible,sticky,dateline) because it is "covering". (Note: In InnoDB, the PK (threadid) is silently appended to every secondary key.)
  • What is the EXPLAIN for the first formulation? It should (if the Optimizer is smart enough) act mostly like the second. But I am not sure.

It is possible to make every page as fast as the first. See my blog on remembering where you left off instead of using OFFSET.

A quick fix may be to tell search engines not to follow the links here.