PostgreSQL – Using OFFSET and LIMIT on Complex Query

optimizationperformancepostgresqlpostgresql-9.3postgresql-performance

I am doing a pretty complex query in Postgres 9.3.4:

SELECT p.*
FROM   unnest('{19082, 19075, 20705, 18328, 19110, 24965, 18329, 27600
              , 17804, 20717, 27598, 27599}'::int[]) s(source_id)
     , LATERAL (
   SELECT *
   FROM   posts
   WHERE  source_id = s.source_id
   AND    deleted_at IS NULL
   ORDER  BY external_created_at DESC
   LIMIT  100
   ) p
ORDER  BY p.external_created_at DESC
LIMIT  100;

See here for more details on my Postgres db, and settings, etc.

Which is great, unless I try to do some pagination.
To see what I mean. When I do a count, with no limit set:

SELECT p.*
FROM   unnest('{19082, 19075, 20705, 18328, 19110, 24965, 18329, 27600
              , 17804, 20717, 27598, 27599}'::int[]) s(source_id)
     , LATERAL (
   SELECT *
   FROM   posts
   WHERE  source_id = s.source_id
   AND    deleted_at IS NULL
   ORDER  BY external_created_at DESC
   ) p
ORDER  BY p.external_created_at DESC;

I get (107 rows), which is the correct number of results.

If I want to paginate, I'd use OFFSET on the subquery. If the first page has 100 posts, the second page should have 7 posts.

SELECT p.*
FROM   unnest('{19082, 19075, 20705, 18328, 19110, 24965, 18329, 27600
              , 17804, 20717, 27598, 27599}'::int[]) s(source_id)
     , LATERAL (
   SELECT *
   FROM   posts
   WHERE  source_id = s.source_id
   AND    deleted_at IS NULL
   ORDER  BY external_created_at DESC
   LIMIT  100
   OFFSET 100
   ) p
ORDER  BY p.external_created_at DESC
LIMIT  100;

but instead it returns (0 rows).

If I do the pagination on the outer query:

SELECT p.*
FROM   unnest('{19082, 19075, 20705, 18328, 19110, 24965, 18329, 27600
              , 17804, 20717, 27598, 27599}'::int[]) s(source_id)
     , LATERAL (
   SELECT *
   FROM   posts
   WHERE  source_id = s.source_id
   AND    deleted_at IS NULL
   ORDER  BY external_created_at DESC
   ) p
ORDER  BY p.external_created_at DESC
LIMIT  100
OFFSET 100;

I get (7 rows) which is correct, but then the query is insanely slow (see this question)

Is there a way to paginate this query properly, not slow it down, and get the correct results back?

Best Answer

For just the two pages, a compromise could be:

SELECT p.*
FROM   unnest('{19082, 19075, 20705, 18328, 19110, 24965, 18329, 27600
              , 17804, 20717, 27598, 27599}'::int[]) s(source_id)
     , LATERAL (
   SELECT *
   FROM   posts
   WHERE  source_id = s.source_id
   AND    deleted_at IS NULL
   ORDER  BY external_created_at DESC
   LIMIT  200   -- 100 + 100 (outer limit + offset) per source_id
   ) p
ORDER  BY p.external_created_at DESC
OFFSET 100
LIMIT  100;

Which will be considerably slower than the first 100 rows, but still much faster than your original query in your previous question.

For just 107 rows total in your example, performance will be almost identical.

Performance decreases with every later page, obviously. OFFSET is a pain for performance, generally. But it becomes a PITA when combined with your difficult query.

I would consider a materialized view with pre-calculated row numbers or the more complex (but also more potent) solution here:

Also related: