Postgresql – Bad execution plan when having partial index and big In-clause in Postgres

execution-plan, index, postgresql, postgresql-11

Postgres always seems to use a sequential scan where it could have used a partial index to get an index-only scan. It only happens when the IN clause has more than 100 elements.

Given the following table:

create table foo(id bigint primary key, bar bigint); 

-- Populate 10 million rows; bar is non-NULL only for every thousandth id
insert into foo (id, bar)
select g.id, case when id % 1000 = 0 then id else null end
from generate_series(1, 10000000) as g (id);

-- Create a partial unique index covering only the non-NULL bar values
create unique index ix_foo_bar on foo(bar) where bar is not null;

analyze foo;
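
Only every thousandth row has a non-NULL bar, so the partial index covers roughly 10,000 of the 10,000,000 rows. A quick way to confirm the scale:

select count(*) filter (where bar is not null) as indexed_rows,
       count(*) as total_rows
from foo;

select pg_size_pretty(pg_table_size('foo')) as table_size,
       pg_size_pretty(pg_relation_size('ix_foo_bar')) as partial_index_size;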

And given the following query with a large IN list:

explain analyze select count(*) from foo where bar in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101);

The query plan shows a sequential scan. It is slow and has a high cost:

QUERY PLAN                                                                             
------------------------------------
 Finalize Aggregate  (cost=612955.35..612955.36 rows=1 width=8) (actual time=254.605..254.605 rows=1 loops=1)
   ->  Gather  (cost=612955.13..612955.34 rows=2 width=8) (actual time=254.474..258.242 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=611955.13..611955.14 rows=1 width=8) (actual time=247.743..247.744 rows=1 loops=3)
               ->  Parallel Seq Scan on foo  (cost=0.00..611955.03 rows=42 width=0) (actual time=247.740..247.740 rows=0 loops=3)
                     Filter: (bar = ANY ('{1,2,3,4,5(...)
                     Rows Removed by Filter: 3333333
 Planning Time: 0.867 ms
 Execution Time: 258.323 ms

Setting enable_seqscan = off has no effect; it still does a sequential scan.
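
For reference, this is what I mean; the plan stays a Parallel Seq Scan, just with the planner's disable cost added on top:

set enable_seqscan = off;
-- re-running the explain analyze above still chooses a Parallel Seq Scan,
-- because the planner has no usable index path to fall back to
reset enable_seqscan;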

If I add an explicit bar is not null to the query, it uses the index:

explain analyze select count(*) from foo where bar is not null and bar in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101);

Index only scan:

    QUERY PLAN                                                                                                                                                                        
----------------------------------------------------------------------
 Aggregate  (cost=153.55..153.56 rows=1 width=8) (actual time=0.267..0.267 rows=1 loops=1)
   ->  Index Only Scan using ix_foo_bar on foo  (cost=0.29..153.55 rows=1 width=0) (actual time=0.262..0.262 rows=0 loops=1)
         Index Cond: (bar = ANY ('{1,2,3,4,5,6 (...), 101}'::bigint[]))
         Heap Fetches: 0
 Planning Time: 0.531 ms
 Execution Time: 0.319 ms
(6 rows)

It also uses the index if the IN clause has fewer elements (the cut-off is between 100 and 101), or if I have a complete index instead of a partial one, as sketched below.
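
For comparison, the complete-index variant looks like this (the index name is just illustrative); it has no predicate for the planner to prove, but it also has to index all ten million rows:

-- A full index sidesteps the predicate proof entirely, so the
-- 101-element IN query can use it directly; the trade-off is that it is
-- far larger than the 10,000-row partial index.
create index ix_foo_bar_full on foo (bar);
analyze foo;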

Why doesn't Postgres use the partial index when the IN clause has more than 100 elements? Is this a known limitation of the query planner, or is it a bug?

Best Answer

This will be fixed in PostgreSQL 12, which will be released soon.

I think the summary here is that we are only willing to do so much work to try to prove that a partial index can be used, because all queries have to go through that work even if they don't end up using the partial index. With this change, they found a more efficient way to do that work in this particular case, and so no longer impose the 100-element limit on it.
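
Concretely, the planner has to prove that the query's WHERE clause implies the index predicate bar is not null before it may use ix_foo_bar (the EXPLAIN output shows the IN list rewritten to bar = ANY('{...}')). My understanding is that version 11 only attempts that proof by expanding the array into individual equalities and gives up past 100 elements, while version 12 can prove the implication directly from the = ANY expression. On 11, repeating the index predicate by hand side-steps the proof:

-- Works on 11 regardless of the number of IN elements, because the index
-- predicate "bar is not null" now appears verbatim in the WHERE clause.
-- IN (...) and = ANY (array[...]) are the same thing to the planner.
select count(*)
from foo
where bar is not null
  and bar = any (array[1, 2, 3]::bigint[]);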