We recently upgraded our database from PostgreSQL 9.6 to 11 on RDS (after the upgrade we ran VACUUM FULL and ANALYZE on the database).
We noticed that a frequent query changed its execution plan and no longer uses the correct index.
SELECT "posts".*
FROM "posts"
WHERE "posts"."delete_at" IS NULL
  AND "posts"."tenant_id" = 1582
  AND "posts"."approved" = TRUE
  AND (posts.status IN (0, 1, 2, 3, 6, 9))
ORDER BY id DESC
LIMIT 3000 OFFSET 0
The query applies several filters and uses LIMIT + OFFSET for pagination.
The most selective filters here are tenant_id and status, for which we have a compound index.
In Postgres 9.6 the execution plan is:
Limit  (cost=36057.09..36064.59 rows=3000 width=1082) (actual time=7.159..8.404 rows=2018 loops=1)
  ->  Sort  (cost=36057.09..36130.91 rows=29529 width=1082) (actual time=7.159..7.664 rows=2018 loops=1)
        Sort Key: id DESC
        Sort Method: quicksort  Memory: 1091kB
        ->  Index Scan using index_posts_on_tenant_id_and_status on posts  (cost=0.56..34204.03 rows=29529 width=1082) (actual time=0.022..5.052 rows=2018 loops=1)
              Index Cond: ((tenant_id = 1582) AND (status = ANY ('{0,1,2,3,6,9}'::integer[])))
              Filter: ((delete_at IS NULL) AND approved)
Planning time: 0.289 ms
Execution time: 8.856 ms
The execution time is fast and we are using our compound index.
In Postgres 11:
Limit  (cost=1000.49..30635.60 rows=3000 width=1149) (actual time=7.064..2100.604 rows=2479 loops=1)
  ->  Gather Merge  (cost=1000.49..1138801.15 rows=115181 width=1149) (actual time=7.062..2101.843 rows=2479 loops=1)
        Workers Planned: 4
        Workers Launched: 4
        ->  Parallel Index Scan Backward using posts_pkey on posts  (cost=0.43..1124081.93 rows=28795 width=1149) (actual time=1.831..1679.456 rows=496 loops=5)
              Filter: ((delete_at IS NULL) AND approved AND (tenant_id = 1582) AND (status = ANY ('{0,1,2,3,6,9}'::integer[])))
              Rows Removed by Filter: 1187090
Planning Time: 1.134 ms
Execution Time: 2102.218 ms
We are not using the compound index and get poor performance.
We ran some experiments and found interesting things.
First experiment: the tenant has 2400-2500 posts. If we lower the limit from the default 3000 to 2400:
Limit  (cost=1000.49..24708.58 rows=2400 width=1149) (actual time=7.716..39.017 rows=2400 loops=1)
  ->  Gather Merge  (cost=1000.49..1138801.15 rows=115181 width=1149) (actual time=7.715..38.848 rows=2400 loops=1)
        Workers Planned: 4
        Workers Launched: 4
        ->  Parallel Index Scan Backward using posts_pkey on posts  (cost=0.43..1124081.93 rows=28795 width=1149) (actual time=2.092..23.554 rows=494 loops=5)
              Filter: ((delete_at IS NULL) AND approved AND (tenant_id = 1582) AND (status = ANY ('{0,1,2,3,6,9}'::integer[])))
              Rows Removed by Filter: 12398
Planning Time: 0.271 ms
Execution Time: 39.158 ms
Execution is fast again, but we are still using the wrong index.
Second experiment, and this is the interesting one: we change the ORDER BY to created_at instead of id (it is important to note that we have an index on id DESC for this kind of query):
Limit  (cost=193711.00..193991.02 rows=2400 width=1149) (actual time=5.473..7.277 rows=2400 loops=1)
  ->  Gather Merge  (cost=193711.00..204909.92 rows=95984 width=1149) (actual time=5.472..7.093 rows=2400 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Sort  (cost=192710.98..192830.96 rows=47992 width=1149) (actual time=2.901..2.948 rows=816 loops=3)
              Sort Key: created_at DESC
              Sort Method: quicksort  Memory: 778kB
              Worker 0:  Sort Method: quicksort  Memory: 273kB
              Worker 1:  Sort Method: quicksort  Memory: 327kB
              ->  Parallel Index Scan using index_posts_on_tenant_id_and_status on posts  (cost=0.56..189776.55 rows=47992 width=1149) (actual time=0.052..2.045 rows=826 loops=3)
                    Index Cond: ((tenant_id = 1582) AND (status = ANY ('{0,1,2,3,6,9}'::integer[])))
                    Filter: ((delete_at IS NULL) AND approved)
Planning Time: 0.279 ms
Execution Time: 7.473 ms
We get a fast execution time and are using the correct compound index.
We don't know why this happens, and we don't want to switch our code to the new ORDER BY before we understand what's going on here.
We tried:
- Running ANALYZE on the posts table: didn't help.
- Upgrading to a stronger machine with 2x the memory and 2x the CPU: didn't help.
- Removing 50% of the rows that are marked via delete_at and running ANALYZE again: didn't help.
- Using CREATE STATISTICS with dependencies on tenant_id and status, then running ANALYZE: got the exact same execution plan.
- Disabling parallelism on Postgres 11: got a 2x slower execution time.
Any help will be appreciated.
Best Answer
Your query is of this shape:
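(Schematically, with <condition> standing for the WHERE clause and <expression> for the sort key:)

```sql
SELECT ...
FROM posts
WHERE <condition>
ORDER BY <expression>
LIMIT <n>;
```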
There are two approaches to process a query like that:

1. Use indexes to speed up <condition> and sort the result.
2. Use an index on <expression> to get the results in sorted order and discard everything that does not satisfy <condition> until we reach the LIMIT.

Now it seems that the cost estimates for both paths are pretty close: 36000 vs. 30500, and the estimate in the latter case is so low only because you have 5 cores working on the query. You must have increased max_parallel_workers_per_gather, and from the "short" execution time of 2 seconds I even suspect that you artificially increased the degree of parallelism by setting the parallel_workers storage parameter on the table.

Moreover, PostgreSQL 9.6 doesn't have parallel index scans, so it isn't even tempted to choose the second path, because with one process the index scan would take longer.
Now that strategy does not work out: in reality, there are only 2479 matching rows rather than the estimated 28795, so the index scan cannot stop early but has to scan the complete index, because the LIMIT is never reached. Index scans are slow, so the plan performs much worse than PostgreSQL thought.

In a way, the increased capabilities in v11 opened the door to yet another way to process the query that turned out to be a trap.
What can you do to put PostgreSQL on the right track:
The crude method: change the ORDER BY clause so that it doesn't match the indexed expression. Then the bad road is blocked. But it also cannot be taken in cases where it would be the better road...
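A common way to do this, sketched with the query from the question (the + 0 is a no-op arithmetically, but it stops the planner from matching the sort to the index on id):

```sql
SELECT "posts".*
FROM "posts"
WHERE "posts"."delete_at" IS NULL
  AND "posts"."tenant_id" = 1582
  AND "posts"."approved" = TRUE
  AND posts.status IN (0, 1, 2, 3, 6, 9)
ORDER BY id + 0 DESC  -- same ordering, but posts_pkey no longer applies
LIMIT 3000 OFFSET 0;
```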
The better method: reduce the degree of parallelism, either by lowering max_parallel_workers_per_gather globally or by setting the parallel_workers storage parameter on the table to a lower value. Then perhaps that plan will become less attractive. It may not work out, because the other plan could also be parallelized.
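A sketch of both options (the values here are illustrative, not recommendations):

```sql
-- Per table: cap the number of workers planned for scans of posts
ALTER TABLE posts SET (parallel_workers = 1);

-- Or globally, in postgresql.conf / the RDS parameter group, e.g.:
-- max_parallel_workers_per_gather = 2
```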
Maybe the best method: Try to fix the row count mis-estimate so that PostgreSQL knows what to expect.
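One common way, assuming the misestimate comes from coarse per-column statistics, is to raise the statistics target for the relevant columns and re-analyze:

```sql
ALTER TABLE posts ALTER COLUMN tenant_id SET STATISTICS 1000;
ALTER TABLE posts ALTER COLUMN status SET STATISTICS 1000;
ANALYZE posts;
```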
Then see if the estimate is better.
If that does not do the trick, try extended statistics:
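For example, functional-dependency statistics on the two correlated columns (the object name posts_tenant_status is arbitrary):

```sql
CREATE STATISTICS posts_tenant_status (dependencies)
  ON tenant_id, status FROM posts;
ANALYZE posts;
```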