PostgreSQL – Selecting N Rows Before or After a Specified Row

postgresql

Let's say I have a simple table in Postgres (11.3):

create table posts
(
    id serial not null,
    created_at timestamp(0)
    constraint posts_pkey
        primary key (id)
);

If a user requests id=5869, I need to be able to return the N rows before and the N rows after that row in a query that is ordered by the created_at column. If we're able to assume that the larger the id, the larger the created_at, we can do something relatively simple like this:

(select * from posts where id < 5869 order by id limit 10)
union all
(select * from posts where id >= 5869 order by id limit 11);

However, I am unable to assume that the higher id is most recently created and I'm wondering what the best way to retrieve that data is in that case. This method works, but is remarkably slow on a 100k row dataset:

WITH 
  boundaries AS (
    SELECT *,
           row_number() OVER (ORDER BY created_at DESC) AS rownum
    FROM posts
  ),
  target_boundary AS (
     SELECT *
     FROM boundaries
     WHERE boundaries.id = 5869
 )
SELECT posts.*, boundaries.rownum
FROM posts
LEFT JOIN boundaries ON posts.id = boundaries.id
JOIN target_boundary ON boundaries.rownum BETWEEN target_boundary.rownum - 10 AND target_boundary.rownum + 10

Running through that was taking upwards of 800 milliseconds which is far too slow on a dataset so small.

I have also tried a variation of the above using lead() and lag(), but that was even less efficient.

Is there a better way to do this query? Is there perhaps a window function I am missing in Postgres that would handle it?

Best Answer

Use the power of UNION ALL:

WITH init AS (
   SELECT created_at
   FROM posts 
   WHERE id = 5869
)
(
   (SELECT posts.*
    FROM posts
       CROSS JOIN init
    WHERE posts.created_at <= init.created_at 
    ORDER BY posts.created_at DESC 
    LIMIT 11) 
 UNION ALL 
   (SELECT posts.* 
    FROM posts 
       CROSS JOIN init 
    WHERE posts.created_at > init.created_at 
    ORDER BY posts.created_at 
    LIMIT 10)
);

This query assumes that there are no duplicates in created_at.

For good performance, you need indexes on id (you have that with the primary key) and created_at.

If you need the result sorted, use my query as a subselect and add an ORDER BY.