We have a fairly simple setup to replicate from our on-premise master PostgreSQL database to our presentation tier in AWS. We're using WAL shipping using the archive_command
setting. Basically the setup looks like this:
+-------------+
| Master |
+-------------+
WAL |
segments |
\|/ +--------------+
+-------------+ WAL +-+------------+ |
| |----------->| Hot Standby | |
| S3 | segments | Slaves | |
| | | |-+
+-------------+ +--------------+
This setup seems to generally be fairly robust, but I haven't come up with a good way to detect failures, either the master failing to push archives up or a slave or slaves failing to retrieve the log files. What's a good way to determine if a slave is up-to-date with respect to master? What's a good way to determine if the master has failed to ship a WAL file?
Just to clarify, we are using the slaves strictly as read replicas, we will never failover to them.
Best Answer
Here are a few ways:
pg_current_xlog_location()
on the primary withpg_last_xlog_replay_location()
on the standby. That will give you the lag in bytes, which might not be very useful for alerting, but it can be useful to chart it.pg_last_xact_replay_timestamp()
on the standby against the current time.pg_last_xact_replay_timestamp()
became available.