PostgreSQL not using index during count(*)

index-tuningperformancepostgresqlpostgresql-9.3postgresql-performance

I have a COUNT(*) query in PostgreSQL that runs often, and looks like:

SELECT COUNT(*) 
  FROM customer 
 WHERE source_id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);

This query takes between 30-60 seconds to run and searches millions of records.

EXPLAIN ANALYZE shows it's doing a sequential scan, so I created the index:

CREATE INDEX customer_by_source ON customer (source_id)
WHERE source_id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);

However, EXPLAIN ANALYZE still shows it's doing a sequential scan and isn't using the index.

How do I speed up this query and make it use the index?

Edit: My Postgres version is 9.3.3. The table has about 20 million records, divided pretty evenly among each source_id, of which another 5 aren't included in the list.

Best Answer

If I interpret your added information correctly, you have 21 distinct source_id with roughly 1 million rows each (divided pretty evenly among each source_id). That means your query counts roughly 3/4 of the whole table. An index typically can't buy much either way.

Test with EXPLAIN (just EXPLAIN is enough for this purpose) to see the estimated cost for each variant:

EXPLAIN SELECT .... -- your query

And take note of all cost= numbers in the output.

Then (only in your session):

SET enable_seqscan = off;

And repeat the procedure. Once more after:

SET enable_indexscan = off;

This should tell you why Postgres chooses a sequential scan. Typically, if your index is substantially smaller than the table (index entries substantially smaller than table rows and / or substantially fewer rows) and if conditions for an index-only scan are met, Postgres will choose that route - unless your cost settings are seriously out of touch with the reality of your setup.

Most importantly, the visibility map has to show that whole pages are visible to all transactions. VACUUM updates the visibility map after writes. Read details on the linked Wiki page.

You can try your query once more immediately after running VACUUM ANALYZE customer;.

If index-only scans are not possible, a sequential scan will most probably be faster than a bitmap index scan, meaning that your index would be of no use.

Update for Postgres 9.6

There was an important restriction for partial indexes, improved in Postgres 9.6. The release notes:

  • Allow use of an index-only scan on a partial index when the index's WHERE clause references columns that are not indexed (Tomas Vondra, Kyotaro Horiguchi)

For example, an index defined by CREATE INDEX tidx_partial ON t(b) WHERE a > 0 can now be used for an index-only scan by a query that specifies WHERE a > 0 and does not otherwise use a. Previously this was disallowed because a is not listed as an index column.