PostgreSQL – How to prevent seek pagination from doing a sequential scan over the entire table

execution-plan, paging, postgresql, postgresql-performance, query-performance

I have 3 tables and a materialized view:

resource_categories contains all category names and metadata

create table if not exists resource_categories (
    category_id INT,
    title VARCHAR(255),
    content TEXT,
    icon VARCHAR(50)
);

resources contains all resources under each category

create table if not exists resources (
    resource_id INT,
    title VARCHAR(255),
    content TEXT,
    link VARCHAR(1000),
    category_id INT REFERENCES resource_categories(category_id),
    icon VARCHAR(255),
    created_at DATE,
    updated_at DATE
);

resource_votes contains user liking a resource or disliking it

create table if not exists resource_votes (
    resource_id INT REFERENCES resources(resource_id),
    user_id INT,
    vote BOOLEAN
);
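
Note: as written, the REFERENCES clauses cannot actually be created, because PostgreSQL requires the referenced column of a foreign key to be covered by a PRIMARY KEY or UNIQUE constraint. A sketch of the constraints the schema presumably intends (assuming the ID columns are meant to be primary keys):

-- Hypothetical constraints; the referenced column of a foreign key
-- must be covered by a PRIMARY KEY or UNIQUE constraint.
ALTER TABLE resource_categories ADD PRIMARY KEY (category_id);
ALTER TABLE resources ADD PRIMARY KEY (resource_id);
ALTER TABLE resource_votes ADD PRIMARY KEY (resource_id, user_id);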

resource_votes_aggregate a materialized view with likes per resource_id

CREATE MATERIALIZED VIEW resource_votes_aggregate AS
SELECT
   resource_id,
   COUNT(CASE WHEN vote = TRUE THEN 1 END) AS likes
FROM
   resource_votes
GROUP BY
   resource_id;
I want to execute the following queries:

  • Find all resources sorted in descending order of likes (find the most liked resources)
  • Find all resources sorted in descending alphabetical order of their titles

I also want to use SEEK / KEYSET pagination for efficiency.

Query to find resources in descending order of their likes page 1

SELECT
   r.resource_id,
   title,
   COALESCE(likes, 0) AS likes 
FROM
   resources r 
   LEFT JOIN
      resource_votes_aggregate a 
      ON r.resource_id = a.resource_id 
ORDER BY
   likes DESC,
   resource_id DESC LIMIT 5;

Execution Plan

QUERY PLAN
Limit (cost=74.50..74.52 rows=5 width=157) (actual time=0.058..0.060 rows=5 loops=1)
-> Sort (cost=74.50..76.80 rows=918 width=157) (actual time=0.058..0.058 rows=5 loops=1)
Sort Key: (COALESCE(a.likes, '0'::bigint)) DESC, r.resource_id DESC
Sort Method: top-N heapsort Memory: 25kB
-> Hash Right Join (cost=12.03..59.25 rows=918 width=157) (actual time=0.032..0.046 rows=50 loops=1)
Hash Cond: (a.resource_id = r.resource_id)
-> Seq Scan on resource_votes_aggregate a (cost=0.00..30.40 rows=2040 width=12) (actual time=0.001..0.004 rows=38 loops=1)
-> Hash (cost=10.90..10.90 rows=90 width=149) (actual time=0.021..0.021 rows=50 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> Seq Scan on resources r (cost=0.00..10.90 rows=90 width=149) (actual time=0.003..0.010 rows=50 loops=1)
Planning Time: 0.050 ms
Execution Time: 0.075 ms

This and all the other queries generate a sequential scan, which defeats the entire purpose of seek pagination.

How do I fix this? Help is appreciated.

LINK TO DB FIDDLE

Best Answer

There may be more potential for improvement. But the obvious big issue is this:

SELECT
   r.resource_id,
   title,
   COALESCE(likes, 0) AS likes
FROM
   resources r 
   LEFT JOIN
      resource_votes_aggregate a 
      ON r.resource_id = a.resource_id 
ORDER BY
   likes DESC,               -- refers to output column!
   resource_id DESC
LIMIT 5;

The manual:

If an ORDER BY expression is a simple name that matches both an output column name and an input column name, ORDER BY will interpret it as the output column name. This is the opposite of the choice that GROUP BY will make in the same situation. This inconsistency is made to be compatible with the SQL standard.

So even if there is an applicable index including resource_votes_aggregate.likes, it cannot be used as the column is hidden behind the expression COALESCE(likes, 0) AS likes - reflected in the query plan as:

Sort Key: (COALESCE(a.likes, '0'::bigint)) DESC

Fix with:

SELECT
   r.resource_id,
   r.title,
   COALESCE(a.likes, 0) AS likes
FROM
   resources r 
   LEFT JOIN
      resource_votes_aggregate a 
      ON r.resource_id = a.resource_id 
ORDER BY
   a.likes DESC NULLS LAST,  -- refers to input column!
   r.resource_id DESC
LIMIT 5;

This refers to the input column now and qualifies for index usage. It's substantially cheaper to evaluate in any case, even without index, so you can't go wrong with this.

Since likes obviously can be NULL (it comes from the nullable side of a LEFT JOIN), be sure to add NULLS LAST.

Table-qualification for all input columns (like I added) is a good way to avoid confusion - not only for the problem at hand.

MATERIALIZED VIEW

The MV in your fiddle would be more efficient like this:

CREATE MATERIALIZED VIEW resource_votes_aggregate AS
SELECT resource_id
     ,(count(*) FILTER (WHERE vote))::int AS likes  -- faster
FROM   resource_votes
GROUP  BY resource_id;

The aggregate FILTER is typically faster.
count() returns bigint (8 bytes). Cast to integer (4 bytes).
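
If you ever refresh the MV with REFRESH MATERIALIZED VIEW CONCURRENTLY (an assumption; the question does not say how it is refreshed), it additionally needs a unique index:

-- Required before REFRESH ... CONCURRENTLY works:
CREATE UNIQUE INDEX ON resource_votes_aggregate (resource_id);

-- Then the view can be refreshed without blocking concurrent readers:
REFRESH MATERIALIZED VIEW CONCURRENTLY resource_votes_aggregate;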

This is the all-important index (missing in your fiddle):

CREATE INDEX ON resource_votes_aggregate (likes DESC, resource_id DESC);

With two integer columns, index tuples fit the minimum size of 8 bytes. (With bigint for likes, it would be 16 bytes: 12 + 4 bytes of padding.)

Obviously, you also need an index on resources(resource_id). If that's the PK, it's already there.

If you use this query a lot, and the preconditions for index-only scans are met, a multicolumn index on resources (resource_id) INCLUDE (title) may pay off.
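
That could look like the following (a sketch; INCLUDE requires PostgreSQL 11 or later, and index-only scans also depend on the visibility map being reasonably up to date):

-- Covering index: resource_id drives the join, title is carried
-- along in the INCLUDE clause so the heap need not be visited.
CREATE INDEX ON resources (resource_id) INCLUDE (title);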

Optimized query

If Postgres is not smart enough to give you the best query plan, this equivalent query will:

(
SELECT r.resource_id, r.title, a.likes -- COALESCE not needed
FROM   resource_votes_aggregate a
JOIN   resources r USING (resource_id)
ORDER  BY a.likes DESC                 -- NULLS LAST not needed
        , a.resource_id DESC           -- perfect for above index
LIMIT  5  -- logically redundant, but may help with best plan
)
UNION ALL
(
SELECT resource_id, title, 0 AS likes 
FROM   resources r
WHERE  NOT EXISTS (  -- only rows without likes
   SELECT FROM resource_votes_aggregate a
   WHERE  a.resource_id = r.resource_id
   )
ORDER  BY r.resource_id DESC
LIMIT  5  -- logically redundant, but may help with best plan
)
LIMIT 5

The second SELECT will typically never be executed if there are at least 5 resources with likes.

Aside

You mentioned pagination. Be sure to avoid LIMIT / OFFSET for big tables.
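
For completeness, a keyset version of the first branch for a subsequent page could look like this (a sketch; 17 and 42 stand in for the likes and resource_id values of the last row of the previous page, which the application has to remember):

-- Seek past the last seen row with a row-value comparison;
-- matches the index on (likes DESC, resource_id DESC).
SELECT r.resource_id, r.title, a.likes
FROM   resource_votes_aggregate a
JOIN   resources r USING (resource_id)
WHERE  (a.likes, a.resource_id) < (17, 42)
ORDER  BY a.likes DESC, a.resource_id DESC
LIMIT  5;

PostgreSQL can use the index directly for the row-value comparison, so each page is fetched without scanning or skipping earlier rows.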