Postgresql – Temporarily disable indexes before updating whole table

amazon-rdsindexpg-restorepostgresqlupdate

For testing purposes I periodically restore production backup into a new database (AWS RDS PostgreSQL instance) with pg_restore. In the new database I run SQL queries that anonymize sensitive data. However, a query like below updates over 20M records in a table that has ~10 indexes (created during pg_restore).

UPDATE users SET email = users.id || '@example.com',
phone_number = NULL;

This takes a few hours on current setup.

To speed things up, I wanted to disable the table's indexes just for running the update and then enable them again and reindex the table. How can I do this? Any other ways to speed up this UPDATE?

Notes:

  1. I've tried

    UPDATE pg_index
    SET indisready=false
    WHERE indrelid = (
        SELECT oid
        FROM pg_class
        WHERE relname='users'
    );
    

but got the error below.

ERROR:  permission denied for relation pg_index

Apparently you can't do this as superuser in AWS RDS PostgreSQL instance.

  1. Dropping the indexes and then creating them again wouldn't be convenient, because I would have to know what columns they cover in advance. Ideally I would have a command that store existing index configuration in some temporary variable and then recreate them again using this info.

Best Answer

When you restore the data, use pg_restore --section=pre-data --section=data. This will leave off the indexes.

Then run the UPDATEs you want to, then run pg_restore --section=post-data