PostgreSQL Replication – How xlog Files Travel from Master to Slave

Tags: high-availability, log-shipping, postgresql, replication

I have some questions about how the workflow proceeds in PostgreSQL replication using archived logs.

If I have this configuration in the master's postgresql.conf:

archive_command = 'test ! -f /path/to/archive/%f && cp %p /path/to/archive/%f'

After the %p and %f parameters have been replaced, the actual command executed might look like this:

test ! -f /path/to/archive/00000001000000A900000065 && cp pg_xlog/00000001000000A900000065 /path/to/archive/00000001000000A900000065

OK, so if the file isn't already in the archive, it will be copied.

If I have this configuration in the slave's recovery.conf:

restore_command = 'cp /path/to/archive/%f %p'

Then, if I don't configure anything like this:

scp /master/path/to/archive/* /slave/path/to/archive/

How is it possible that the xlog files in /path/to/archive/ travel from the master to the slave, and that the slave then copies them into its pg_xlog/ directory?

Best Answer

You're expected to ensure that /path/to/archive is a shared volume, like an NFS fileshare, that both servers can access: the master writes to it and the replica reads from it. Or you're expected to use an appropriate command to copy the files to/from a shared storage location.
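To make the flow concrete, here is a minimal sketch that simulates it on one machine. It assumes nothing about a real cluster: the three directories under a temp directory are hypothetical stand-ins for the master's pg_xlog, the shared archive, and the replica's pg_xlog, and the two helper functions (archive_wal and restore_wal, names invented for this sketch) just run the commands from the question with %p and %f already substituted.

```shell
#!/bin/sh
# Simulate WAL shipping through a shared archive using local directories.
# All paths are hypothetical stand-ins created under a temp directory.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

master_xlog="$work/master/pg_xlog"     # stands in for the master's pg_xlog
archive="$work/archive"                # stands in for the shared volume (e.g. NFS)
standby_xlog="$work/standby/pg_xlog"   # stands in for the replica's pg_xlog
mkdir -p "$master_xlog" "$archive" "$standby_xlog"

seg=00000001000000A900000065
printf 'wal-data' > "$master_xlog/$seg"

# What archive_command runs on the master: $1 plays %p, $2 plays %f.
# Fails (nonzero exit) if the segment is already in the archive.
archive_wal() {
    test ! -f "$archive/$2" && cp "$1" "$archive/$2"
}

# What restore_command runs on the replica: $1 plays %f, $2 plays %p.
restore_wal() {
    cp "$archive/$1" "$2"
}

archive_wal "$master_xlog/$seg" "$seg"      # master side
restore_wal "$seg" "$standby_xlog/$seg"     # replica side

# Both copies should now be byte-identical.
cmp "$master_xlog/$seg" "$standby_xlog/$seg" && echo "segment shipped"
```

The only thing the two sides have in common is the archive directory. If that directory is not visible from both machines, restore_wal has nothing to copy, which is exactly the situation described in the question.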

I think this might need to be made more obvious in the documentation.

Re your comment:

Why is it necessary to create an archive directory, instead of copying directly into pg_xlog?

The reason is that the master's pg_xlog is generally on a different server from the replica(s), so the replicas cannot access it. You really don't want to put pg_xlog on shared storage because it's performance critical.

It also does you no good to have a backup that requires access to the master server's pg_xlog when the master server just melted in a fire.
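A rough sketch of that failure scenario, again using hypothetical temp directories rather than a real cluster: one segment is archived before the master disappears, another is not. The restore_wal helper is an invented name standing in for the restore_command from the question.

```shell
#!/bin/sh
# Sketch: the replica can restore from the archive alone once the master is gone.
# Every path is a hypothetical stand-in under a temp directory.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

master="$work/master"
archive="$work/archive"
standby_xlog="$work/standby/pg_xlog"
mkdir -p "$master/pg_xlog" "$archive" "$standby_xlog"

archived=00000001000000A900000065   # completed and archived before the failure
lost=00000001000000A900000066       # still on the master only, never archived
printf 'old' > "$master/pg_xlog/$archived"
printf 'new' > "$master/pg_xlog/$lost"
cp "$master/pg_xlog/$archived" "$archive/$archived"

rm -rf "$master"   # the master server "melts in a fire"

# Equivalent of restore_command = 'cp /path/to/archive/%f %p'
restore_wal() {
    cp "$archive/$1" "$2" 2>/dev/null
}

if restore_wal "$archived" "$standby_xlog/$archived"; then
    echo "restored $archived"
fi
if ! restore_wal "$lost" "$standby_xlog/$lost"; then
    echo "$lost not in archive"
fi
```

The archived segment is still recoverable; the one that only ever lived in the master's pg_xlog is gone. The nonzero exit status from cp for the missing file is also how restore_command signals to PostgreSQL that a requested segment is not available.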

By archiving WALs, PostgreSQL can also reduce filesystem fragmentation. It recycles WAL segments in pg_xlog once they have been archived and are no longer required, renaming them and reusing them as if they were new files. This is a significant performance improvement that would not be possible if it had to keep all the old WAL files around.

Finally, it creates a logical separation between "transaction logs the master server still needs to operate" and "transaction logs that are now only required for backup/archival".