Postgresql – Why isn’t Postgres doing an index-only scan

indexperformancepostgresqlquery-performance

I am using Postgres 9.4. This is my table:

 id                | integer                 | not null default nextval('frontend_prescription_id_seq'::regclass)
 presentation_code | character varying(15)   | not null
 total_items       | integer                 | not null
 actual_cost       | double precision        | not null
 quantity          | double precision        | not null
 processing_date   | date                    | not null
 pct_id            | character varying(3)    | not null
Indexes:
    "frontend_prescription_pkey" PRIMARY KEY, btree (id)
    "frontend_prescription_5283da53" btree (pct_id)
    "frontend_prescription_528f368c" btree (processing_date)
    "frontend_prescription_cost_by_pres_code" btree (presentation_code, pct_id, actual_cost)

There are about 450m rows in the table. I have just run VACUUM ANALYZE, so the query planner should be up to date.

Here is my query:

EXPLAIN (ANALYZE, BUFFERS)
  SELECT SUM(actual_cost) AS cost, 
    pct_id AS id, processing_date 
  FROM frontend_prescription 
  WHERE presentation_code='0501013B0AAAAAA' 
  GROUP BY pct_id, processing_date;

And here is the output:

 HashAggregate  (cost=1537066.69..1537066.87 rows=18 width=16) (actual time=189604.446..189608.532 rows=9725 loops=1)
   Group Key: pct_id, processing_date
   Buffers: shared hit=17 read=418783
   ->  Bitmap Heap Scan on frontend_prescription  (cost=14473.73..1533678.60 rows=451746 width=16) (actual time=543.825..187857.510 rows=416328 loops=1)
         Recheck Cond: ((presentation_code)::text = '0501013B0AAAAAA'::text)
         Heap Blocks: exact=416287
         Buffers: shared hit=17 read=418783
         ->  Bitmap Index Scan on frontend_prescription_cost_by_pres_code  (cost=0.00..14360.79 rows=451746 width=0) (actual time=317.102..317.102 rows=416328 loops=1)
               Index Cond: ((presentation_code)::text = '0501013B0AAAAAA'::text)
               Buffers: shared hit=2 read=2511
 Planning time: 0.329 ms
 Execution time: 189610.440 ms

Why does Postgres need to do a heap scan? Isn't all the information needed for the query in the composite frontend_prescription_cost_by_pres_code index?

Readable version of the output: http://explain.depesz.com/s/oV4

Best Answer

processing_date is not in the index, so it cannot use an index only scan.

Create the index on (presentation_code, pct_id, processing_date, actual_cost) so that it can use an index only scan, and in that order so that it can efficiently use the sort order of the index to do the grouping.