You want to do an incremental backup of the archive folder to remote storage.
Should you need to restore from the backup, the basic scenario is that you'd need your base backup as the starting point, and the entire contents of the archive folder to replay the transactional activity that happened between the starting point and the crash.
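For reference, that replay is driven by restore_command in recovery.conf; here is a minimal sketch, where the archive path and target time are assumptions, not anything from your setup:

```
# recovery.conf in the restored data directory (PostgreSQL 9.x-era).
# /mnt/backups/wal and the timestamp are placeholders -- adjust to taste.
cat > "$PGDATA/recovery.conf" <<'EOF'
restore_command = 'cp /mnt/backups/wal/%f "%p"'
# Optional: stop replay at a point in time instead of the end of WAL:
# recovery_target_time = '2015-06-01 12:00:00'
EOF
```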
Also, to keep the files in the archive folder from piling up forever, you'll want to take a new base backup from time to time and delete the files that were archived before the new base backup.
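pg_archivecleanup (shipped in contrib) is one safe way to do that pruning; a sketch, where the archive path and the .backup file name are placeholders:

```
# Remove archived segments that predate the new base backup. The
# .backup file name below is a placeholder for the one your new base
# backup wrote into the archive.
pg_archivecleanup /mnt/backups/wal 000000010000000000000042.00000028.backup
```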
Short version: no. PostgreSQL doesn't support generating no WAL whatsoever, because changes to system catalog tables in the pg_catalog schema are always logged. There's also transaction ID logging (pg_clog), the multixact tracking data (pg_multixact), etc., in addition to the xlog, but they tend to be very small.
As Daniel says, you can run with minimal WAL generation by:
- Using wal_level = minimal
- Creating all tables as UNLOGGED or TEMPORARY tables
In this case, you will also want to set fsync = off, full_page_writes = off, and synchronous_commit = off, since you have no requirement for durability and crash recovery.
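Putting those together, a minimal sketch of the relevant postgresql.conf changes, assuming $PGDATA points at your data directory:

```
# Append the throwaway-durability settings and restart.
# $PGDATA is an assumption -- substitute your actual data directory.
cat >> "$PGDATA/postgresql.conf" <<'EOF'
wal_level = minimal
max_wal_senders = 0          # must be 0 when wal_level = minimal
fsync = off
full_page_writes = off
synchronous_commit = off
EOF
pg_ctl restart -D "$PGDATA"
```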
Of course, if anything goes wrong (like power loss, unplanned restart, DB server crash, pg_ctl -m immediate stop, etc.) your data will be totally unrecoverable, but that seems to be what you want.
The rate of WAL creation and rotation with this configuration will be negligible for most applications.
If you don't mind having to re-initdb after any restart (even a clean one), you could put pg_xlog on a tmpfs, but I doubt it's worth the hassle when Pg will generate only a tiny bit of WAL.
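If you want to try it anyway, the usual trick is to mount a tmpfs and symlink pg_xlog onto it; a rough sketch, with all paths as assumptions:

```
# Move pg_xlog onto tmpfs (contents vanish on reboot, forcing re-initdb).
pg_ctl stop -D "$PGDATA" -m fast
sudo mkdir -p /mnt/pg_xlog
sudo mount -t tmpfs -o size=1G tmpfs /mnt/pg_xlog
sudo chown postgres:postgres /mnt/pg_xlog
mv "$PGDATA"/pg_xlog/* /mnt/pg_xlog/
rmdir "$PGDATA"/pg_xlog
ln -s /mnt/pg_xlog "$PGDATA"/pg_xlog
pg_ctl start -D "$PGDATA"
```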
You should also check that you don't have WAL archiving enabled (archive_mode = on) and that wal_keep_segments isn't set. Both of those are only useful if you're doing WAL-based backup or replication; if you're not, WAL should not accumulate, it should be rapidly recycled. Maybe your checkpoints are so infrequent that you run out of disk before WAL gets recycled? Check whether the checkpoint_segments parameter is set very high.
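You can check all three in one query from psql, for example:

```
# Inspect the relevant settings (names are 9.x-era;
# checkpoint_segments was replaced by max_wal_size in 9.5).
psql -c "SELECT name, setting FROM pg_settings
         WHERE name IN ('archive_mode', 'wal_keep_segments', 'checkpoint_segments');"
```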
I think you're trying to solve the wrong problem here. The issue isn't stopping PostgreSQL from producing WAL, it's figuring out why it piled up in your batch jobs.
See also: Optimizing PostgreSQL for fast testing.
Best Answer
To address your first question: if you take a single basebackup of your database and just keep archiving away the WAL segments, the time to restore that database will grow longer and longer as the number of WAL segments increases. That is not really an ideal situation to be in if you have downtime and need to bring a working database back up.
Most people figure out how much time they can spend restoring, and what recovery requirements they need to meet, and keep multiple basebackups and extra WAL segments as their storage budget and their recovery requirements allow.
Additionally, with very, very large WAL archives, individual segments can occasionally be lost or corrupted (which is yet another great reason to always test your backups!), and if a WAL segment fails to restore, no segments after the failed one will restore either. At that point, you're out of luck.
One general rule I follow is to take a weekly basebackup, and keep 3 weeks of WAL segments and 2 basebackups, which allows me enough peace of mind to know I have one known-good backup, and a known-good set of WAL segments, and a reasonable restore time.
For your second question: yes, you can have both a hot standby server and incremental backups, if you set things up that way. Your hot standby server only requires an initial basebackup from the primary, and a stream of WAL segments or WAL records (in the case of Streaming Replication, which 9.4 supports and is much nicer than simple WAL shipping).
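On 9.4, the standby side needs little more than hot_standby = on and a recovery.conf pointing at the primary; a minimal sketch, with the connection string as an assumption:

```
# On the standby, after restoring the base backup into $PGDATA:
echo "hot_standby = on" >> "$PGDATA/postgresql.conf"
cat > "$PGDATA/recovery.conf" <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com user=replicator'
EOF
pg_ctl start -D "$PGDATA"
```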
I would recommend having a hot standby using Streaming Replication, and then taking your basebackups with pg_basebackup from the hot standby server, which will reduce the resource load on your primary considerably.
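For example, the weekly base backup might be taken like this, where the host, user, and target directory are assumptions:

```
# Weekly base backup taken from the hot standby; -X stream also
# collects the WAL needed to make the backup self-consistent.
pg_basebackup -h standby.example.com -U replicator \
    -D /mnt/backups/base_$(date +%Y%m%d) -X stream -P
```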
For an incremental (PITR) backup, you just need to have your primary server send the WAL segments it generates somewhere else (it's best that it's different hardware, of course) using an archive_command or pg_receivexlog.
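Both mechanisms are sketched below; the backup host and paths are assumptions:

```
# Option 1: push each finished segment from the primary
# (in postgresql.conf on the primary):
#   archive_mode = on
#   archive_command = 'rsync -a %p backup-host:/mnt/backups/wal/%f'

# Option 2: pull WAL continuously over the replication protocol,
# run on the backup host (renamed pg_receivewal in PostgreSQL 10):
pg_receivexlog -h primary.example.com -U replicator -D /mnt/backups/wal
```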
Pushing the weekly basebackup you created with pg_basebackup there as well will make keeping track of all the necessary components for incremental restoration much easier. Plus, having everything together makes testing those backups trivial.

As for the third question, I don't know of a PITR manager that runs on Windows. In theory, since it's just Python, it could run on Windows, but as the FAQ says, it hasn't been tested.
Magnus Hagander has a great talk on backups from PgCon 2013 that you might find helpful: PostgreSQL Backup Strategies