PostgreSQL – Running pg_dump on a Hot Standby Server

backuppostgresqlreplication

Disclaimer: I admittedly haven't tried this yet, but I'm not sure I would know if it wasn't working correctly, so I wanted to ask.

I would like to run a nightly backup job (via pg_dumpall) from a hot standby server running streaming replication, to avoid putting that load on the primary. I've only seen mention of some gotchas people have run into, e.g. here and here, but very little guidance. It's okay if the backup lags behind the primary slightly, as long as it's consistent (which it should be).

My questions are:

  1. Do I really want to do this, or should the backup be done on the primary server? Why?

  2. When doing a dump on the standby, what settings do I need and procedure should I use to do that correctly? e.g. must I stop replication for the duration of the backup?

Best Answer

AFAIK, running pg_dump on a hot standby is one of the major things standbys are useful for. It's perfectly safe, though it isn't perfectly reliable - the dumps can fail if the standby aborts the transaction when it's falling too far behind the master.

The only thing you really need to watch is to make sure the standby is current and is keeping up. If the standby lost its connection to master and fell too far behind, you don't want to be merrily backing up a three week out-of-date standby.

You will need to allow the standby to fall quite far behind the master during the backup, since it'll otherwise have to cancel your pg_dump transaction in order to continue replaying WAL. See the documentation on hot standby, particularly the "handling query conflicts" section, and the max_standby_archive_delay and max_standby_streaming_delay parameters.

Note that the master must be willing to keep enough WAL archives to allow the slave to catch up again.