PostgreSQL – Is REINDEX Required After CLUSTER?

postgresql

I'm considering using CLUSTER to reorder a table by an index. I understand that this recreation of the table data makes all the existing indexes either bloat or be useless. I've seen some indications that a REINDEX is required after a CLUSTER. I've found other references that indicate that CLUSTER does a REINDEX. The Official Documentation says nothing at all about REINDEX being part of CLUSTER or required (Although it does suggest running ANALYZE after the CLUSTER)

Can anyone definitively (i.e. with some sort of reference to official docs) say whether or not a REINDEX is required after a CLUSTER?

Best Answer

You do not need to reindex, because CLUSTER effectively does it for you.

More specifically, CLUSTER locks the source table then creates a new copy of it ordered according to the target index. It creates indexes on the new copy then replaces the old table and indexes with the new ones.

Note that this is also true of VACUUM FULL in 9.0+.

If you've been seeing discussion suggesting that CLUSTER bloats indexes it could be people who're assuming that CLUSTER works like pre-9.0 VACUUM FULL. You might also be seeing and misreading discussions that mention index bloat caused by the old VACUUM FULL implementation and suggesting CLUSTER as an alternative.

This is implied in the documentation:

a temporary copy of the table is created that contains the table data in the index order. Temporary copies of each index on the table are created as well. Therefore, you need free space on disk at least equal to the sum of the table size and the index sizes

What it doesn't say, but should, is that those temporary copies then replace the original table. (Bold mine).

Best Answer

Related Solutions

PostgreSQL 7.4 data corruption

PostgreSQL maintenance

Related Question