I'm running pg_receivewal
locally, without compression, it stores segments on the same drive with the db files. Here's the replication statistic:
imbolc=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 12920
usesysid | 10
usename | postgres
application_name | pg_receivewal
client_addr |
client_hostname |
client_port | -1
backend_start | 2019-08-25 13:18:00.504531+07
backend_xmin |
state | streaming
sent_lsn | C/80009218
write_lsn | C/80009218
flush_lsn |
replay_lsn |
write_lag | 00:00:09.838314
flush_lag | 00:00:40.028949
replay_lag | 00:00:40.028949
sync_priority | 0
sync_state | async
If I set synchronous_commit = off
, write_lag
still remains significant, above 2.5 seconds.
So my questions are:
- why the
write_lag
is so high? - does it mean that in case of the db crash last 10 seconds of transactions can be lost?
- is there a way to improve it?
Best Answer
If the database has a "soft" crash, like power failure, it will go through autorecovery upon startup, and will recover all transactions (other than those possibly lost to synchronous_commit = off) using the log files it found in the pg_wal or pg_xlog directory. pg_receivewal is irrelevant to this situation.
If the database has a "hard" crash, such that the storage media is permanently lost and you have to recovery from a backup, then you will lose much more than 10 seconds of transactions. Since pg_receivewal is storing to the files in the same place as the db files, they will all be lost together. Running pg_receivewal on the same machine as the database and storing the files to the same drive is a pointless exercise, suitable only for testing purposes.
Other than that issue, you are misinterpreting what the field is for. It is not to measure the data at risk, it is "consistent with the goal of measuring synchronous commit and transaction visibility delays for recent write transactions."