PostgreSQL – Multi-Expression Indexes Not Being Used as Expected

execution-planindexindex-statisticspostgresqlstatistics

I have a PostgreSQL 9.3 table with multiple expression indexes defined. When I use explain to see which indexes are being used for my query, I am surprised to see that PostgreSQL is not using the index which most closely-matches the conditions in my WHERE clause. It appears to be a problem specific to expression indexes. Here is a sqlfiddle link which demonstrates the problem:

http://sqlfiddle.com/#!15/580e0/1

CREATE TABLE Table1
    ("a" int, "b" int, "c" int, "d" int)
;
CREATE index a ON Table1 (a);
CREATE index ab ON Table1 (a,b);
CREATE index bplus1 ON Table1 ((b+1));
CREATE index abplus1 ON Table1 ((a+1),(b+1));

The basic multi-column index ab is being used properly:

 explain select * from Table1 where a = 1 AND b = 1

    |                                                       QUERY PLAN |
    |------------------------------------------------------------------|
    | Index Scan using ab on table1  (cost=0.13..8.15 rows=1 width=16) |
    |                                Index Cond: ((a = 1) AND (b = 1)) |

However the similar multi-column expression index abplus1 isn't being used:

explain select * from Table1 where (a+1) = 2 AND (b+1) = 2


|                                                           QUERY PLAN |
|----------------------------------------------------------------------|
| Index Scan using bplus1 on table1  (cost=0.13..8.16 rows=1 width=16) |
|                                            Index Cond: ((b + 1) = 2) |
|                                                Filter: ((a + 1) = 2) |

The strange thing is that the more-specific index does get used if the single-column expression indexes are removed. See this example http://sqlfiddle.com/#!15/f4fda/1:

explain select * from Table1 where (a+1) = 2 AND (b+1) = 2


|                                                            QUERY PLAN |
|-----------------------------------------------------------------------|
| Index Scan using abplus1 on table1  (cost=0.14..8.15 rows=1 width=16) |
|                         Index Cond: (((a + 1) = 2) AND ((b + 1) = 2)) |

It looks to me like there is a conflict between the single-expression and the multi-expression indexes. The surprising thing is that there doesn't appear to be any such conflict when using basic (non-expression) indexes.

Is there anything that I can do to work around this problem? I would like to keep both of my expression indexes, as each one is well-suited for particular queries.

Best Answer

It looks to me like there is a conflict between the single-expression and the multi-expression indexes.

All applicable indexes are candidates, if that's what you mean by "conflict". But that's the extent of it. Postgres estimates execution time based on table and index statistics in combination with cost settings, and the best estimate wins.

If Postgres expects one of your two conditions to be selective enough, it will likely take a matching single-column index and filter possible (few!) false positives in a FILTER step with the second condition.

However, that is certainly not the case for the data distribution you mention in the comment:

20,000 rows. b=1 for 10,000, a=1 for 2. One row for a=1,b=1

@ypercube already demonstrated in his sqlfiddle that your test should use the multicolumn index. Even more so with the data distribution you describe.

If you provide the output of EXPLAIN (ANALYZE, BUFFERS) instead of just EXPLAIN, we might be able to see more. I would guess you didn't run ANALYZE after creating the expression index, which is required to update statistics. autovacuum will kick in after some time, but not fast enough if you run your test immediately after creating the index (and never for temporary tables!).

ANALYZE Table1;

Come to think of it, we (that is: you & me) solved this problem before:

Also related:

That aside, Postgres 9.3 is getting old. There have been 3 major release since. If you run into problems like this I suggest to test a current version and see if the problem is still there.