Postgresql – How to estimate reindexing time

indexpostgresql

How do you estimate the total time reindexing an entire table will take in PostgreSQL?

I have a large PostgreSQL database, the primary table of which has 15 million rows with 100 columns and 200 indexes.

I'm seeing a problem where, after a night of processes loading data into the table, queries to the table are terribly slow for about 3-6 hours after the loading ends. However, these queries miraculously become fast again after 6 hours post-load.

Is this temporary slowness caused by Postgres updating indexes in the background? How do I speed up this processing? I can't find any parameters for controlling this.

I was thinking about running a REINDEX on the table immediately after loading ends, but I'm not sure how long this will take, and I'm worried it might take longer than the original load and will only worsen performance.

My table stores dozens of descriptors about a customer, and the indexes (many of them partial indexes) are for speeding up very routine queries run by users.

The database is nearly a terabyte, so I rarely load the entire thing to a dev environment. Instead, I have a script that copies a few thousand records.

Best Answer

As you have 200 index, it would take longer than a vacuum. A reindex always depends on your OI. It seemed that it would take more than an hour.

What I would suggest is:

  1. get rid of some indexes. 200 for 100 columns makes no sense. See how many of the l those you use 90% of the time.

    1.1. The other queries would find their path along other indexes

    1.2. Your record takes more time recording indexes than rows, I that should be slowing down your night update

  2. You could reindex the indexes the you use the most

    2.1. Use REINDEX INDEX index_name;

    2.2. If you can run some reindex in parallel, it would make better use of your server cores