PostgreSQL – How to Alter the WHERE Clause of an Index

indexperformancepostgresql

I need to insert a new two-field index on an existing table with about 600 million rows in it. The table contains data accumulated over about a year with approximately 2 million new rows added per day.

I know adding a new index to one very large table in one go will take ages, so to save time I started by only indexing the last few days worth of data using the following command…

CREATE INDEX  idx_001
ON message (deviceNumber, messageTime)
WHERE messageTime >='07-May-2018 00:00:00' ;

This took about 3 hours, or about 40 minutes to index one day.

Ideally I’d like to index the whole table, but at 40 minutes for a days data doing it on one go will take over a week.

I guess I could create lots of small indexes, each for a few days of data, maybe something like this…

CREATE INDEX  idx_002
ON message (deviceNumber, messageTime)
WHERE messageTime <'07-May-2018 00:00:00' AND messageTime >='01-May-2018 00:00:00'  ;

But I’d prefer to stick with one index. Is there any way of ALTERing the where clause of the existing index to increase the days covered? Or do I have to create lots of small indexes?

Best Answer

No, you cannot alter an index. Because, even if PostgreSQL allowed this, it would need to rebuild the index anyway. Unless it implemented very smart checks on the WHERE clause, which would serve quite rare use cases.

Having many indexes is also not optimal. Every time you run a query, the planner has to consider all existing indexes and pick up the best one. If you have several indexes, the planner takes more time. Not ages - but still, this delay affects each and every query involving that table.

My suggestion is... be patient and build a big index. You can use CREATE INDEX CONCURRENTLY to avoid locks.