Postgresql – Postgres database not using indexes

postgresql

I have a SQL optimization question.

Table Name: charts

Total Records: +250 million

Columns: timestamp(string), imported(boolean), ….

Indexes:

CREATE INDEX index_charts_on_timestamp ON public.charts 
USING btree ("timestamp") WHERE (imported = false)

CREATE UNIQUE INDEX charts_pkey ON public.charts USING btree (id)

QUERY EXPLAIN

SELECT  "charts"."id" FROM 
"charts" WHERE 
"charts"."imported" = FALSE AND 
"charts"."timestamp" = '2018-01-03' ORDER BY 
"charts"."id" ASC LIMIT 1000;

OUTPUT

Limit  (cost=1000.59..237865.48 rows=1000 width=8)
 ->  Gather Merge  (cost=1000.59..33262989.10 rows=140426 width=8)
     Workers Planned: 2
 ->  Parallel Index Scan using charts_pkey on charts_pkey  (cost=0.57..33245780.43 rows=58511 width=8)
    Filter: ((NOT imported) AND (("timestamp")::text = '2018-01-03'::text))
(5 rows)

How can i make postgres to use index_charts_on_timestamp instead?

I cannot present EXPLAIN ANALYZE results, the command simply freezes without any results.

Best Answer

I suggest to improve the index to:

(timestamp, id) WHERE (imported = false)

so it fits the query better, for both the WHERE and ORDER BY clauses.

The plan will only need an index seek and a partial index scan (until it finds the required 1000 rows).

The current index does not hold the id values, so the whole index has to be read and then the id values retrieved from the table and sorted to get the first 1000.

If the rows matching the WHERE conditions are not many (less than 1000 or less than 10k), then the sort operation is not very expensive but the values still have to be read as they are not in the index. If the matching rows are a lot more (say 100K or more), then both fetching and sorting them will be much more expensive.
It might be cheaper in these cases to do a full table scan. The optimizer decided in your case to do something similar: a parallel index scan of the PK index. This avoids the need of sorting but it still has to read the rows from the table until it finds 1000 rows that match the WHERE condition.

Both cases however would have no problem with the suggested index above, as the table would not have to be read at all and no sorting needed as the index order matches the ORDER BY.