PostgreSQL Slave has more files in pg_xlog than /wal_archive

postgresqlpostgresql-9.3replication

Background: for various reasons I am switching PostgreSQL Slave from a Streaming replication to a warm standby replication.

The streaming replication was setup using this guide: http://technology.trapeze.com/journal/postgresql-streaming-replication-ubuntu-1004/

To make the switch to a warm standby: I am following instructions for warm standby here (1) https://wiki.postgresql.org/wiki/Warm_Standby and here (2) http://www.postgresql.org/docs/current/static/pgstandby.html

I am finding that on the slave there are more log files in pg_xlog/ than in wal_archive/ (files sent from Master). How is that possible? I think it's hindering my warm setup.

  • in /var/lib/postgresql/9.3/main/pg_xlog/ of Master I have files from 000000010000000000000001 to 00000001000000000000001A.backup to 00000001000000000000001C
  • in /var/lib/postgresql/wal_archive/ of Slave I have files from 000000010000000000000001 to 00000001000000000000001A.backup to 00000001000000000000001B
  • in /var/lib/postgresql/9.3/main/pg_xlog/ of Slave I have files from 00000001000000000000001C to 000000010000000000000023

recovery.conf contains:

restore_command = 'pg_standby -d -s 2 -t /tmp/pgsql.trigger.5442 [/wal_archive] %f %p %r 2>>standby.log'

when I restart the Slave it complains that it is missing the xxx1C log file

First question, how is it possible for the Slave to have more pg_xlog/ log files than the Master?

Second question, to resync the servers in warm standby mode: do I have to do pg_basebackup again (to essentially copy Master's /data and /pg_xlog directory) to the Slave?

Just trying to understand PostgreSQL 🙂 Thanks in advance!!

I found a very similar question here, that seems to have never been answered: http://www.postgresql.org/message-id/CAE7ByhgTJ6xOe4f3aa42Kn0_5gfEzWW1FZ9QQqsC6RP-Fnh_ow@mail.gmail.com

Best Answer

how is it possible for the Slave to have more pg_xlog/ log files than the Master?

The whole point of archiving WAL on the master to some external location is to let the master then delete it to free space in its pg_xlog, while replicas might still need it.

A replica can have more archives in pg_xlog than the master, and older ones, if it's lagging behind the master due to failure to keep up with replay. However, with pg_standby that shouldn't happen - the archive might contain more xlogs, but the replica should only be reading them on-demand.

It's hard to be specific, because you've given a broad description of the issue rather than actual directory listings, and haven't explained the exact steps you followed to set up the replica. Or shown the exact log file output from the replica. So the best I can do is "it sounds like the replica setup is broken somehw".

to resync the servers in warm standby mode: do I have to do pg_basebackup again (to essentially copy Master's /data and /pg_xlog directory) to the Slave?

Assuming that here /data is the main datadir, containing global, base, pg_clog, etc, and that pg_xlog is the transaction logs from a different disk: Yes, that's right.

You must use the pg_basebackup command, though, or follow the instructions in the manual for correct file system level copies using pg_start_backup() and rsync/cp.

You also have to make sure you've stopped the replica first. Overwriting its datadir while it's running will make it quite upset.


Streaming replication vs warm standby

Hot vs warm standby is orthogonal to streaming vs log shipping replication.

What you're trying to do is use log shipping instead of streaming replication. It doesn't matter for this purpose if the replica is a hot standby or a warm standby, i.e. whether or not it's accepting queries.

Personally I recommend using both methods - use streaming, and fall back to log shipping if there's a problem with streaming. PostgreSQL does this automatically if both are configured.