PostgreSQL – query doesn't use a column's dedicated index after a certain number of rows

Tags: index, index-tuning, postgresql

I have the following table:

create schema test;
CREATE TABLE test.foo (
    info_datetime timestamp NOT NULL,
    asset_id int4 NOT NULL,
    price float8 NULL,
    CONSTRAINT minute_prices_pkey PRIMARY KEY (info_datetime, asset_id)
);
CREATE INDEX foo_asset_id_idx ON test.foo USING btree (asset_id);

When I run this query against a small number of rows, it uses the dedicated index:

explain (analyze, buffers)
select *
from test.foo mp
where mp.asset_id = 1;

Bitmap Heap Scan on foo mp  (cost=4.21..14.37 rows=8 width=20) (actual time=0.005..0.006 rows=0 loops=1)
  Recheck Cond: (asset_id = 1)
  Buffers: shared hit=1
  ->  Bitmap Index Scan on foo_asset_id_idx  (cost=0.00..4.21 rows=8 width=0) (actual time=0.003..0.003 rows=0 loops=1)
        Index Cond: (asset_id = 1)
        Buffers: shared hit=1
Planning Time: 0.077 ms
Execution Time: 0.026 ms

However, after I insert about 3,000 rows, the same query stops using the index:

Seq Scan on foo mp  (cost=0.00..42.56 rows=1332 width=20) (actual time=0.012..0.435 rows=1332 loops=1)
  Filter: (asset_id = 1)
  Rows Removed by Filter: 873
  Buffers: shared hit=15
Planning Time: 0.074 ms
Execution Time: 0.707 ms
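
For reference, data with the same distribution as in these plans (1332 rows with asset_id = 1 and 873 rows with other values) can be generated with something like the following sketch; the timestamps and the other asset_id values are made up, only the row counts come from the plan output:

insert into test.foo (info_datetime, asset_id, price)
select timestamp '2024-01-01' + g * interval '1 minute', 1, random()
from generate_series(1, 1332) as g;

insert into test.foo (info_datetime, asset_id, price)
select timestamp '2024-01-01' + g * interval '1 minute', 2 + (g % 5), random()
from generate_series(1, 873) as g;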

Best Answer

In this plan:

Seq Scan on foo mp  (cost=0.00..42.56 rows=1332 width=20) (actual time=0.012..0.435 rows=1332 loops=1)
  Filter: (asset_id = 1)
  Rows Removed by Filter: 873
  Buffers: shared hit=15
Planning Time: 0.074 ms
Execution Time: 0.707 ms

You can see that roughly 60% (1332 of 2205) of the rows in the table are returned. Reading that many rows with a Seq Scan is much cheaper than doing an index lookup for each of them.
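If you want to verify the planner's cost model yourself, you can discourage sequential scans for the current session and rerun the query. A minimal sketch (enable_seqscan is a planner hint that makes seq scans look prohibitively expensive, not a hard switch):

set enable_seqscan = off;  -- make seq scans look very expensive to the planner
explain (analyze, buffers)
select * from test.foo mp where mp.asset_id = 1;
reset enable_seqscan;      -- restore the default

At 60% selectivity, the forced index plan will typically show a higher estimated cost (and no fewer buffer accesses) than the Seq Scan, which is exactly why the planner avoids it.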

In the first plan, Postgres expected only 8 rows to be returned (in reality none were), and for such a small fraction of the table it is more efficient to probe the index.
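These row estimates come from the table's statistics. If you want to see where the 1332-row estimate originates, you can inspect the per-column stats for asset_id; a sketch (the stats are only current after an ANALYZE):

analyze test.foo;

select n_distinct, most_common_vals, most_common_freqs
from pg_stats
where schemaname = 'test'
  and tablename  = 'foo'
  and attname    = 'asset_id';
-- most_common_freqs holds the estimated fraction of rows per value;
-- the planner's row estimate is roughly that fraction times the table row count.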

See this related answer.