PostgreSQL Query Performance – Slow Query with Pagination and Filtering

explainoptimizationpostgresqlquery-performance

For the DB in question, I have followed the advice of Pagination Done the
PostgreSQL Way

That is, results are being paginated on a composite index with a definite sort order.

Now, my problem is that when this is combined with a where-clause filter on a join table, the query becomes very slow.

How can I make my query faster in the filtering case?

Note: pagination_index is a unique index on sort_key_1, sort_key_2, sort_key_3.

Query without filter

EXPLAIN (ANALYZE, BUFFERS) SELECT *
FROM big_table

join small_table on small_table_id = small_table.id
--WHERE small_table.filter_column = 9173

ORDER BY sort_key_1, sort_key_2, sort_key_3
LIMIT 20

Limit  (cost=0.85..124.26 rows=20 width=293) (actual time=49.947..50.254 rows=20 loops=1)
  Buffers: shared hit=64 read=1
  ->  Nested Loop  (cost=0.85..318749153.46 rows=51655908 width=293) (actual time=49.942..50.189 rows=20 loops=1)
        Buffers: shared hit=64 read=1
        ->  Index Scan using pagination_index on big_table  (cost=0.56..106181666.21 rows=51655908 width=113) (actual time=0.019..0.054 rows=20 loops=1)
              Buffers: shared hit=5
        ->  Index Scan using small_table_pkey on small_table  (cost=0.28..4.11 rows=1 width=180) (actual time=2.498..2.500 rows=1 loops=20)
              Index Cond: (id = big_table.small_table_id)
              Buffers: shared hit=59 read=1
Total runtime: 92.107 ms

Query with filter

EXPLAIN (ANALYZE, BUFFERS) SELECT *
FROM big_table

join small_table on small_table_id = small_table.id
WHERE small_table.filter_column = 9173

ORDER BY sort_key_1, sort_key_2, sort_key_3
LIMIT 20

Limit  (cost=0.85..79443.10 rows=20 width=293) (actual time=1331003.261..2376282.943 rows=16 loops=1)
  Buffers: shared hit=45840126 read=1148265
  ->  Nested Loop  (cost=0.85..108506196.66 rows=27317 width=293) (actual time=1331003.257..2376282.873 rows=16 loops=1)
        Join Filter: (big_table.small_table_id = small_table.id)
        Rows Removed by Join Filter: 154967708
        Buffers: shared hit=45840126 read=1148265
        ->  Index Scan using pagination_index on big_table  (cost=0.56..106181666.21 rows=51655908 width=113) (actual time=0.010..1544834.357 rows=51655908 loops=1)
              Buffers: shared hit=45840121 read=1148265
        ->  Materialize  (cost=0.28..14.60 rows=3 width=180) (actual time=0.002..0.007 rows=3 loops=51655908)
              Buffers: shared hit=5
              ->  Index Scan using small_table_id_index on small_table  (cost=0.28..14.59 rows=3 width=180) (actual time=0.008..0.017 rows=3 loops=1)
                    Index Cond: (filter_column = 9173)
                    Buffers: shared hit=5
Total runtime: 2376283.041 ms

Query with filter but no order by

This is relatively fast also.

EXPLAIN (ANALYZE, BUFFERS) SELECT *
FROM big_table

join small_table on small_table_id = small_table.id
WHERE small_table.filter_column = 9173

--ORDER BY sort_key_1, sort_key_2, sort_key_3
LIMIT 20

Limit  (cost=0.56..78.57 rows=20 width=293) (actual time=0.035..100.824 rows=16 loops=1)
  Buffers: shared hit=18 read=132
  ->  Nested Loop  (cost=0.56..106540.40 rows=27317 width=293) (actual time=0.031..100.759 rows=16 loops=1)
        Buffers: shared hit=18 read=132
        ->  Seq Scan on small_table  (cost=0.00..200.91 rows=3 width=180) (actual time=0.014..10.124 rows=3 loops=1)
              Filter: (filter_column = 9173)
              Rows Removed by Filter: 5670
              Buffers: shared hit=2 read=128
        ->  Index Scan using big_table_small_table_index on big_table  (cost=0.56..35179.54 rows=26696 width=113) (actual time=30.168..30.183 rows=5 loops=3)
              Index Cond: (small_table_id = small_table.id)
              Buffers: shared hit=16 read=4
Total runtime: 100.901 ms

Best Answer

I now believe my question to be a duplicate of Slow ORDER BY with LIMIT.

I simplified the query (taking out the complication of a join table) to what matters.

Fast query

explain (analyze, buffers)
SELECT *
FROM big_table

where small_table_id = 822573
ORDER BY sort_key_1, sort_key_2, sort_key_3


Sort  (cost=24642.79..24686.64 rows=17540 width=113) (actual time=0.983..1.471 rows=227 loops=1)
  Sort Key: sort_key_1, sort_key_2, transmitter_id
  Sort Method: quicksort  Memory: 56kB
  Buffers: shared hit=1 read=8
  ->  Index Scan using big_table_small_table_index on big_table  (cost=0.56..23406.36 rows=17540 width=113) (actual time=0.055..0.514 rows=227 loops=1)
        Index Cond: (small_table_id = 822573)
        Buffers: shared hit=1 read=8
Total runtime: 1.941 ms

Slow query

Note: only change is the addition of a LIMIT clause.

explain (analyze, buffers)
SELECT *
FROM big_table

where small_table_id = 822573
ORDER BY sort_key_1, sort_key_2, sort_key_3

LIMIT 1


Limit  (cost=0.56..6061.27 rows=1 width=113) (actual time=206737.547..206737.550 rows=1 loops=1)
  Buffers: shared hit=13259226 read=303563
  ->  Index Scan using pagination_index on big_table  (cost=0.56..106304766.93 rows=17540 width=113) (actual time=206737.541..206737.541 rows=1 loops=1)
        Filter: (small_table_id = 822573)
        Rows Removed by Filter: 15520057
        Buffers: shared hit=13259226 read=303563
Total runtime: 206737.643 ms

Fast query, with limit

As suggested in the linked question, disabling index scans makes the limited query fast:

SET enable_indexscan = OFF;

explain (analyze, buffers)
SELECT *
FROM big_table

where small_table_id = 822573
ORDER BY sort_key_1, sort_key_2, sort_key_3

LIMIT 1

Limit  (cost=62302.77..62302.77 rows=1 width=113) (actual time=0.897..0.899 rows=1 loops=1)
  Buffers: shared hit=9
  ->  Sort  (cost=62302.77..62346.62 rows=17540 width=113) (actual time=0.894..0.894 rows=1 loops=1)
        Sort Key: sort_key_1, sort_key_2, transmitter_id
        Sort Method: top-N heapsort  Memory: 25kB
        Buffers: shared hit=9
        ->  Bitmap Heap Scan on big_table  (cost=388.50..62215.07 rows=17540 width=113) (actual time=0.040..0.477 rows=227 loops=1)
              Recheck Cond: (small_table_id = 822573)
              Buffers: shared hit=9
              ->  Bitmap Index Scan on big_table_small_table_index  (cost=0.00..384.11 rows=17540 width=0) (actual time=0.030..0.030 rows=227 loops=1)
                    Index Cond: (small_table_id = 822573)
                    Buffers: shared hit=4
Total runtime: 0.932 ms

Fast query, with Common Table Expression

The following query is also fast:

SET enable_indexscan = ON;

explain (analyze, buffers)
with t as (
SELECT *
FROM big_table

where small_table_id = 822573
ORDER BY sort_key_1, sort_key_2, sort_key_3
)
select * from t
LIMIT 1

Limit  (cost=24686.64..24686.66 rows=1 width=3172) (actual time=1.011..1.013 rows=1 loops=1)
  Buffers: shared hit=9
  CTE t
    ->  Sort  (cost=24642.79..24686.64 rows=17540 width=113) (actual time=1.001..1.001 rows=1 loops=1)
          Sort Key: big_table.sort_key_1, big_table.sort_key_2, sort_key_3
          Sort Method: quicksort  Memory: 56kB
          Buffers: shared hit=9
          ->  Index Scan using big_table_small_table_index on big_table  (cost=0.56..23406.36 rows=17540 width=113) (actual time=0.025..0.504 rows=227 loops=1)
                Index Cond: (small_table_id = 822573)
                Buffers: shared hit=9
  ->  CTE Scan on t  (cost=0.00..350.80 rows=17540 width=3172) (actual time=1.007..1.007 rows=1 loops=1)
        Buffers: shared hit=9
Total runtime: 1.070 ms