Postgresql – archive_cleanup_command with multiple standby servers

postgresqlreplication

Regarding the archive_cleanup_command command, the Postgresql 9.6 documentation states the following:

Note however that if multiple standby servers are restoring from the
same archive directory, you will need to ensure that you do not delete
WAL files until they are no longer needed by any of the servers.

Is there a built-in way that the master can know when all of the slaves do not need particular WAL files? Or does that functionality need to be added via scripting or other mechanism? If so, what would that look like?

Also, would it be typical to have an archive_cleanup_command on both the master and the slaves, or would the slaves even be archiving?

Best Answer

The built-in way to do this is to use replication slots and to do away with a WAL archive altogether. Replication slots will cause the WAL to be retained in pg_xlog until all standbys have received it, rather than stored in the WAL archive directory. So make sure you have enough space available for this purpose, i.e. assign the space currently set aside for WAL archive to be added to the available space for pg_xlog.

An alternative is to use "wal_keep_segments" rather than replication slots, and define it at a little less than how much space you have reserved for pg_xlog. Any standbys which fall behind by more than this amount will be sacrificed, rather than being allowed to bring down the master.

For your other question, archive_cleanup_command cannot run on the master (it is set in recovery.conf, which the master doesn't have), nor would it make any sense for it to do so.