Postgresql 9.1 PITR file system base backup – do I need to stop database

migrationpostgresql

I am preparing for a migration of a big database to another server and plan to use WAL archiving and Point-In-Time-Recovery to minimize downtime. After reading the documentation I learned I have to do full filesystem-level base backup since the point of WAL archiving started.

In http://www.postgresql.org/docs/9.1/static/backup-file.html, it is stated that:

The database server must be shut down in order to get a usable backup.
Half-way measures such as disallowing all connections will not work
(in part because tar and similar tools do not take an atomic snapshot
of the state of the file system, but also because of internal
buffering within the server). Information about stopping the server
can be found in Section 17.5. Needless to say, you also need to shut
down the server before restoring the data.

Next, in the PITR section it is said the source server does not have to be shutdown if pg_start_backup() was run before.

http://www.postgresql.org/docs/9.1/static/continuous-archiving.html

Perform the backup, using any convenient file-system-backup tool such
as tar or cpio (not pg_dump or pg_dumpall). It is neither necessary
nor desirable to stop normal operation of the database while you do
this.

Also this caught my eye:

Some file system backup tools emit warnings or errors if the files
they are trying to copy change while the copy proceeds. When taking a
base backup of an active database, this situation is normal and not an
error

If the database binary files are changing while performing the backup and this is allowed, how is the file format consistency guaranteed and can I be sure I won't end up with corrupted files?

So my question is, do I or do I not need to stop the source server when performing the full file system backup if steps in PITR section of documentation are followed? And how is it possible the base backup is correct if the files are allowed to change during it? Don't I end up with partial overwrites or something like that? My guess would be that any such damage can be repaired by restoring WAL files? Is this correct?

I am probably just asking what was already said in the documentation, but I need to be sure and I am a bit jumpy about this migration process.

Thank you.

Best Answer

If you have archiving of WAL files already on, there should be no need to restart the primary db server.. If WAL archiving is NOT already turned on, then you might have to restart the server to have the configuration change be active. (Looking at the documentation, I believe issuing a SELECT PG_RELOAD_CONF(); is good enough for archiving .. but the postgresql.conf file may specifically say "changing this setting requires a restart")

The only other requirement is that you issue a pg_start_backup('label') and pg_stop_backup() (neither of which require a db service interruption).

The PITR backup will be valid STARTING from the point of your backup.. (As in, you will not be able to recover to a point BEFORE you issue pg_start_backup unless you have another backup from earlier). Start the backup .. perform a file-level backup of all db files .. stop the backup .. perform a file-level backup of the pg_xlog files ... you're done (with the backup process).

To restore, you need to restore all the files you just backed up (plus any additional WAL files that were generated since you ran the backup) and create a recovery.conf file before starting the server.. I won't go through all the details, as you are obviously already looking through the documentation on it. :)