I need help reducing the restore time for a large dump (400 GB).
We have a database that takes weeks to restore due to the single-threaded nature of index generation in PostgreSQL. Please help me figure out a way to restore it faster, perhaps by disabling index generation at load time and adding the indexes back later, or some better trick.
sample pg_dump command:
pg_dump --compress=0 -bo -F c --lock-wait-timeout=1500 -h $HOST -p $PORT $DBNAME | lbzip2 > $DB-$TIMESTAMP.bz2
sample pg_restore command:
pg_restore -Ov -j 2 -h $HOST -p $PORT --dbname=$DBNAME $RESTOREFILE
The -j option does not help us, since it parallelizes the part that doesn't take long (restoring the data) but not the part that does (index generation). This ticket is to figure out a way to streamline the process so that index generation is done separately, on an already-working database, or to otherwise speed up index generation, on PG 9.2.
I'd like a clear procedure for removing index generation from the restore process and running it afterwards, so that use of the database is not blocked.
Is that possible? What is your opinion?
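For context, here is one possible shape such a procedure could take. This is a sketch, not a tested recipe: it assumes the dump was taken with pg_dump -F c as above, and relies on pg_restore's --section option (available in 9.2) to split the restore into schema, data, and indexes/constraints. The -j value of 4 is an arbitrary placeholder.

```shell
#!/bin/sh
# Sketch: restore in sections so index creation is deferred.
# Assumes $HOST, $PORT, $DBNAME, $RESTOREFILE are set as in the question.

# 1. Restore the schema only (tables, types; no indexes) -- fast:
pg_restore -O --section=pre-data -h $HOST -p $PORT --dbname=$DBNAME $RESTOREFILE

# 2. Load the table data; -j parallelizes across tables here:
pg_restore -O -j 4 --section=data -h $HOST -p $PORT --dbname=$DBNAME $RESTOREFILE

# The database is now queryable (slowly, since there are no indexes yet).

# 3. Build indexes, constraints, and triggers afterwards, also in parallel:
pg_restore -O -j 4 --section=post-data -h $HOST -p $PORT --dbname=$DBNAME $RESTOREFILE
```

Whether step 3 can run in parallel effectively, and whether the database is truly usable between steps 2 and 3 (foreign keys and primary keys are also in post-data), are exactly the questions I'd like an expert opinion on.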
Best Answer
pg_restore has an option to run the time-consuming parts of the restore, such as index rebuilds, with multiple "jobs".
From the pg_restore documentation for PostgreSQL 9.2: