Postgresql – Can PostgreSQL use multiple partial indexes per query

index-tuningperformancepostgresqlpostgresql-performance

I have read that PostgreSQL can use multiple indexes in general, but in the specific case of a query that ranges across two indexes, will it use both? If so, will they be loaded sequentially or together?

For example, if this query spans across two partial indexes by column_1, how will partial indexes be used if at all, and how is the index data loaded and discarded:

SELECT 1 FROM sample_table WHERE column_1 > 50 AND column_2 < 50000

Best Answer

Very short version: Yes, sometimes.


PostgreSQL can use bitmap index scans to combine multiple indexes.

A predicate like

WHERE a > 50 AND a < 50000

is a specialisation of the more general form:

wHERE a > 50 and b < 50000

for a = b.

PostgreSQL can use two indexes here, one for each part of the predicate, and then bitmap AND them. It doesn't matter if they happen to be on different ranges of the same column.

This is much less efficient than a single index, and may not be useful for some queries, but it's possible.

The bigger problem is that PostgreSQL's partial index support is not very bright. Irrespective of whether there's one or two indexes it might just not figure out that it can use the index at all.

Demonstration setup:

CREATE TABLE partial (x integer, y integer);
CREATE INDEX xs_above_50 ON partial(x) WHERE (x > 50);
CREATE INDEX xs_below_50000 ON partial(x) WHERE (x < 5000);
INSERT INTO partial(x,y) SELECT a, a FROM generate_series(1,100000) a;

OK, what will Pg prefer for given queries?

regress=> EXPLAIN SELECT y FROM partial WHERE x > 50 AND x < 50000;
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Index Scan using xs_above_50 on partial  (cost=0.29..1788.47 rows=50309 width=4)
   Index Cond: ((x > 50) AND (x < 50000))
(2 rows)

regress=> EXPLAIN SELECT y FROM partial WHERE x > 20 AND x < 50000;
                          QUERY PLAN                          
--------------------------------------------------------------
 Seq Scan on partial  (cost=0.00..1943.00 rows=50339 width=4)
   Filter: ((x > 20) AND (x < 50000))
(2 rows)

regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 AND x < 50000;
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Index Scan using xs_above_50 on partial  (cost=0.29..1787.45 rows=50258 width=4)
   Index Cond: ((x > 100) AND (x < 50000))
(2 rows)

regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 AND x < 20000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using xs_above_50 on partial  (cost=0.29..710.71 rows=19921 width=4)
   Index Cond: ((x > 100) AND (x < 20000))
(2 rows)

What if we try to force a bitmap index scan just to find out if Pg can use one, even if it's not worth doing for this particular simple case and small sample?

Try:

regress=> SET enable_seqscan  = off;
SET
regress=> SET enable_indexscan  = off;
SET
regress=> SET enable_indexonlyscan  = off;
SET
regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 AND x < 20000;
                                   QUERY PLAN                                   
--------------------------------------------------------------------------------
 Bitmap Heap Scan on partial  (cost=424.48..1166.30 rows=19921 width=4)
   Recheck Cond: ((x > 100) AND (x < 20000))
   ->  Bitmap Index Scan on xs_above_50  (cost=0.00..419.50 rows=19921 width=0)
         Index Cond: ((x > 100) AND (x < 20000))
(4 rows)

Hm. Nope. Not combining the indexes there. It might be able to but simply not think it's worth scanning a second index, though.

What about a query that ORs two predicates instead?

regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 OR x < 200;
                                      QUERY PLAN                                       
---------------------------------------------------------------------------------------
 Bitmap Heap Scan on partial  (cost=1905.29..3848.29 rows=99908 width=4)
   Recheck Cond: ((x > 100) OR (x < 200))
   ->  BitmapOr  (cost=1905.29..1905.29 rows=100000 width=0)
         ->  Bitmap Index Scan on xs_above_50  (cost=0.00..1849.60 rows=99908 width=0)
               Index Cond: (x > 100)
         ->  Bitmap Index Scan on xs_below_50000  (cost=0.00..5.73 rows=193 width=0)
               Index Cond: (x < 200)
(7 rows)

Here PostgreSQL has ORed both indexes to find a match, then done a heap scan and recheck.

So yes, PostgreSQL can combine multiple partial indexes, at least for some queries, where it is useful to do so.

But if I RESET the planner overrides...

regress=> RESET enable_seqscan;
RESET
regress=> RESET enable_indexscan ;
RESET
regress=> RESET enable_indexonlyscan ;
RESET
regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 OR x < 200;
                          QUERY PLAN                          
--------------------------------------------------------------
 Seq Scan on partial  (cost=0.00..1943.00 rows=99908 width=4)
   Filter: ((x > 100) OR (x < 200))
(2 rows)

... Pg will realise it's faster to just sequentially scan the table.