PostgreSQL – WAL Files Not Deleting During PITR

Tags: archive, backup, disk-space, postgresql, recovery

I have an issue with setting up PITR on my Postgres server.
WAL archiving and restoring work quite well, as does restoring to a specific point in time. The issue, however, is that during a long replay all the WAL files are extracted from the archive and pile up in the pg_xlog directory without ever being deleted. This causes the replay to stop with an error once the disk is full. If I then delete all the logs and restart the server, the replay continues until the disk is full again. After repeating this procedure a few times, the recovery completes successfully.
(The actual working time of the server is just a few minutes.)

Since this is already a pain after only a few weeks of history, I cannot take a system like this online, as the recovery process would take forever.

I'd appreciate any help you can provide.

P.S.: wal_level is set to minimal and archiving is off during recovery.

Best Answer

You probably have checkpoint_timeout set very high on the production system, and archive switches are being driven by checkpoint_segments, probably in conjunction with a low setting of archive_timeout.

However, checkpoint_segments has no effect during recovery. Restartpoints during recovery are driven exclusively by checkpoint_timeout. Since recovery can replay files far faster than production took to create them, a huge number of old xlog files accumulate while waiting for a restartpoint to allow them to be cleared out.

One solution is to lower checkpoint_timeout by quite a bit for the system undergoing recovery, and then set it back once recovery has finished. This will cause restartpoints to happen more often, allowing the log files to be removed sooner.
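To get a feel for how much checkpoint_timeout matters here, this is a rough back-of-the-envelope sketch in Python. It assumes the default 16 MB WAL segment size and a made-up replay rate of 5 segments per second; the function name and the numbers are only illustrative, and the result is an estimate of what piles up between two restartpoints, not an exact figure.

```python
# Default WAL segment size; assumed here, adjust if your build differs.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def peak_xlog_bytes(segments_replayed_per_second, checkpoint_timeout_seconds):
    """Rough estimate of restored WAL held in pg_xlog between two restartpoints.

    Restored segments can only be cleared at a restartpoint, so roughly
    (replay rate) x (checkpoint_timeout) segments accumulate in between.
    """
    segments = segments_replayed_per_second * checkpoint_timeout_seconds
    return int(segments * WAL_SEGMENT_BYTES)


if __name__ == "__main__":
    # Hypothetical replay rate of 5 segments per second.
    for timeout in (3600, 300, 30):
        gib = peak_xlog_bytes(5, timeout) / 2**30
        print(f"checkpoint_timeout={timeout}s -> about {gib:.1f} GiB in pg_xlog")
```

Plug in your own replay rate (segments restored per second, as seen in the recovery log) to decide how low checkpoint_timeout needs to be for the disk you have.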

Another option is to simply have a script delete the files from pg_xlog during recovery once a newer file has been restored and renamed. If the system crashes during recovery, it may then have to re-fetch some of those deleted files from the archive.

The system was designed to tolerate the case where files are removed from the archive once they are fetched and so cannot be re-fetched. That is why it stores all the fetched files in pg_xlog until a restartpoint; so it has all the data it needs to restart the recovery if the recovery crashes. As long as you don't remove files from the archive, you can remove them from pg_xlog.
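A minimal sketch of such a cleanup script in Python might look like the following. The function name, the `keep` margin, and the `dry_run` flag are my own inventions for illustration, not anything PostgreSQL ships. It assumes that WAL segments are the files whose names are exactly 24 uppercase hex digits, that the newest name is the segment currently being replayed, and that the archive still holds every file so anything removed can be fetched again.

```python
import os
import re

# WAL segment file names are 24 uppercase hex digits. Anything else
# (timeline .history files, RECOVERYXLOG, archive_status/) is left alone.
SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def prune_replayed_segments(xlog_dir, keep=2, dry_run=True):
    """Remove all but the newest `keep` WAL segments from xlog_dir.

    Returns the names that were removed (or would be, when dry_run is set).
    Only safe while the archive still contains the removed segments.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Names have equal length, so lexicographic order is segment order.
    segments = sorted(
        name
        for name in os.listdir(xlog_dir)
        if SEGMENT_NAME.fullmatch(name)
        and os.path.isfile(os.path.join(xlog_dir, name))
    )
    victims = segments[:-keep]
    if not dry_run:
        for name in victims:
            os.remove(os.path.join(xlog_dir, name))
    return victims
```

You would run something like `prune_replayed_segments("/path/to/data/pg_xlog", dry_run=False)` periodically (from cron or a loop) while recovery is in progress, and stop it once recovery finishes. Leave `dry_run=True` for a first pass to see what it would delete.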

This situation will thankfully be improved in PostgreSQL 9.5.