Using OFFSET in Large MySQL Tables with Sub-Query

join;MySQLperformancequery-performancesubquery

The slow performance for using OFFSET in large table has been widely discussed in various blogs, and the most efficient way is to use an INNER JOIN as

SELECT *
FROM table
INNER JOIN (
SELECT id
FROM table
LIMIT 10 OFFSET 100000)
AS results USING(id);

but I wonder what is the bennefit of INNER JOIN over a simple sub-query as

SELECT *
FROM table
WHERE id >
(SELECT id
FROM table
LIMIT 1 OFFSET 100000)
LIMIT 10;

or more convenient

SELECT *
FROM table
WHERE id IN
(SELECT id
FROM table
LIMIT 10 OFFSET 100000);

In the following example, the ORDER BY is assumed the primary key, but obviously other orders or even WHERE clause can be added. My question is about comparison in performance of INNER JOIN or a simple sub-query`.

Best Answer

In your particular case, I would use LEFT JOIN instead. Why?

In a LEFT JOIN query, you can dictate what keys you want and the order you want to fetch and join, while INNER JOIN lets the MySQL Query Optimizer decide which is best. In some cases, the EXPLAIN plan may look horrible but still give good performance.

Instead of

SELECT *
FROM table
INNER JOIN (
SELECT id
FROM table
LIMIT 10 OFFSET 100000)
AS results USING(id);

flip the query to have the keys first

SELECT B.*
FROM (SELECT id FROM table LIMIT 10 OFFSET 100000) A
LEFT JOIN table B USING (id);

From there, you can control the order by inside in subquery A

SELECT B.*
FROM (SELECT id FROM table LIMIT 10 OFFSET 100000 ORDER BY ...) A
LEFT JOIN table B USING (id);

or after USING (id)

SELECT B.*
FROM (SELECT id FROM table LIMIT 10 OFFSET 100000) A
LEFT JOIN table B USING (id)  ORDER BY ...;

This way, you should be able to navigate (or paginate) through the keys before locating actual data. To show this using empirical evidence, see my StackOverflow post (Fetching a Single Row from Join Table) on using LEFT JOIN with primary keys.

Give it a Try !!!