If you have archiving of WAL files already on, there should be no need to restart the primary db server.. If WAL archiving is NOT already turned on, then you might have to restart the server to have the configuration change be active. (Looking at the documentation, I believe issuing a SELECT PG_RELOAD_CONF();
is good enough for archiving .. but the postgresql.conf file may specifically say "changing this setting requires a restart")
The only other requirement is that you issue a pg_start_backup('label')
and pg_stop_backup()
(neither of which require a db service interruption).
The PITR backup will be valid STARTING from the point of your backup.. (As in, you will not be able to recover to a point BEFORE you issue pg_start_backup
unless you have another backup from earlier). Start the backup .. perform a file-level backup of all db files .. stop the backup .. perform a file-level backup of the pg_xlog files ... you're done (with the backup process).
To restore, you need to restore all the files you just backed up (plus any additional WAL files that were generated since you ran the backup) and create a recovery.conf file before starting the server.. I won't go through all the details, as you are obviously already looking through the documentation on it. :)
There are several ways of doing what you're wanting.
I would recommend using a specific solution such as WAL-E, barman, or OmniPITR to make the logistics of managing basebackups and WAL archives easier on yourself, or using pg_basebackup. Some of these tools, depending on the versions, will allow you to back up off your replica, which will reduce load on your primary server, and not push data out of the filesystem cache, which are all very good things.
And as you expected, there are definitely pros and cons with your approaches.
For your first question, you have to consider that if you take a single basebackup and just keep archiving WAL segments, the time to restore will grow longer with each WAL segment you archive, as it takes a significant amount of time restore each file.
So that is not ideal, and most people tend to take a backup on a schedule that allows them to meet recovery requirements, and sometimes keep multiple basebackups as their storage allotments allow.
Additionally, with a very, very, very large WAL archive, sometimes things can happen with individual WAL segments (which is why you always test your backups), and if a WAL segment fails to restore, no following segments will restore, and you're out of luck.
A general rule I've heard and use myself is if I take weekly basebackups, keeping around 3 weeks of WAL segments and 2 basebackups allows me enough peace of mind to have at least one known-good backup, and a known-good set of WAL, and a fairly reasonable restore time. I am a huge believer in testing the backups I take regularly.
For your second question, either one works well, though, pg_basebackup is easier to use and requires less of your own assembly. If you use rsync in your own script with start and stop basebackup, you have to check the exit code of every command. If you don't and a command fails, you can end up with a corrupt basebackup that's useless. The above mentioned tools, along with pg_basebackup, take care of all of those pedantic details for you.
For your third question, in some cases, yes, you can make a backup off of a replica, instead of the primary. If you can manage it, doing the basebackup and archiving off of a replica is the way to go, because taking a basebackup generates a lot of I/O, and it affects your OS filesystem cache. Unfortunately, it is not supported until 9.2+ with pg_basebackup, so you'll need to use an external tool such as OmniPITR or to upgrade your version of PostgreSQL. So long as you have the WAL to go with it, there shouldn't be any timeline issues or consistency issues with the snapshot from the replica.
Finally, a lot of your questions and best practices are covered in the following talk from Magnus Hagander at PgCon 2013: PostgreSQL Backup Strategies
It is well worth the time spent to watch the video.
Hope that helps. =)
Best Answer
You can run
pg_basebackup
against the slave server just as well as against the master.