PostgreSQL Physical Replication vs Logical Replication

postgresql replication

I have been reading about the difference between logical replication slots and physical replication slots in Postgres. I am confused about some parameters and how they are used.

  1. For a replication slot, there are two parameters, wal_sender_timeout and wal_keep_segments. With the first one, Postgres terminates a replication connection that has been idle for longer than wal_sender_timeout. Suppose we have a logical replication slot and wal_keep_segments is set to 0, so all the older WAL files are deleted from the server. If the replica disconnects and then tries to fetch WAL again, will it be able to recover and resume replication? Does the same thing happen with a physical replication slot, or is wal_sender_timeout applicable only to logical replication slots?

  2. I have a CloudSQL (Google Cloud managed) database instance and have created a read replica that uses a physical replication slot. The parameters set there are wal_sender_timeout=60000 (in ms) and wal_keep_segments=0. So I would expect the read replica to be unable to recover if it loses its connection to the master server for some time. But I think I am missing something here, because there was a network outage for a while and replication still resumed normally afterwards.

Can someone clear this up? I am reading online resources but am unable to tell when people are talking about logical replication and when about physical replication.

I know I can use continuous archiving to handle all of the above problems, but I just want to understand it from the perspective of these parameters.

Best Answer

WAL is retained until both conditions are met: every valid slot is satisfied, and wal_keep_segments/wal_keep_size is satisfied. If all of your replication needs use slots, then wal_keep_size serves no purpose and should be zero.
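The retention rule above can be sketched as a small model. This is a simplified illustration, not PostgreSQL's actual implementation: the function name `oldest_wal_to_keep` is hypothetical, positions are plain byte offsets standing in for LSNs, and real servers work in whole WAL segments.

```python
# Simplified model of the retention rule; not PostgreSQL's real code.
# Positions are byte offsets into the WAL stream (a stand-in for LSNs).

def oldest_wal_to_keep(current_pos, slot_restart_positions, wal_keep_size):
    """Return the oldest WAL position the server must still retain."""
    # wal_keep_size keeps a fixed amount of WAL behind the current position,
    # whether or not any replica needs it.
    keep_floor = max(current_pos - wal_keep_size, 0)
    # Each slot pins WAL from the position its consumer still needs.
    candidates = [keep_floor] + list(slot_restart_positions)
    # Both conditions must be satisfied, so the oldest requirement wins.
    return min(candidates)


# wal_keep_size = 0, one slot whose replica stopped consuming at position 400:
print(oldest_wal_to_keep(1000, [400], 0))   # 400, the slot alone holds WAL back

# No slots, wal_keep_size = 300:
print(oldest_wal_to_keep(1000, [], 300))    # 700
```

In this model, a replica that uses a slot can still reconnect after an outage with wal_keep_segments set to 0, because the slot itself keeps the WAL it needs.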

If the WAL receiver becomes disconnected, the WAL its slot needs is still retained (unless max_slot_wal_keep_size is set and is exceeded) until it reconnects.

For physical replication, use of slots is optional. If you don't use slots, then you can instead use wal_keep_segments/wal_keep_size to give replicas a chance to catch up if they ever fall behind. If they fail to keep up within that amount of WAL, then they will be cut loose and will have to be reestablished from scratch to get them back online. The specified amount of WAL is always retained, even if no replicas actually need it.

For logical replication, use of slots is mandatory. For that reason, wal_keep_segments serves no purpose.

Version 13 (the newest at the time of writing) introduced max_slot_wal_keep_size, which allows a replica that uses a slot to be cut loose once it falls behind by a certain amount. With this setting, the maximum amount of WAL is retained only if something actually needs it. Without this setting and without careful monitoring, the master might fill up its storage and stop working if a replica is permanently disconnected or just falls behind by an unbounded amount.
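As a rough sketch of the cutoff described above, the check below models when a slot would be cut loose. The function name `slot_still_usable` is hypothetical and the logic is deliberately simplified: positions are byte offsets rather than LSNs, and the real server works in whole WAL segments and only invalidates slots at checkpoints. The one detail taken from PostgreSQL itself is that -1, the default for max_slot_wal_keep_size, means no limit.

```python
# Simplified model of the max_slot_wal_keep_size cutoff; not PostgreSQL's real code.

def slot_still_usable(current_pos, slot_restart_pos, max_slot_wal_keep_size=-1):
    """Return True if the slot's consumer can still resume from its position."""
    # -1 means unlimited: the slot retains WAL no matter how far behind it is.
    if max_slot_wal_keep_size < 0:
        return True
    # Otherwise the slot is cut loose once its lag exceeds the limit.
    lag = current_pos - slot_restart_pos
    return lag <= max_slot_wal_keep_size


print(slot_still_usable(1000, 400))        # True: no limit, WAL keeps piling up
print(slot_still_usable(1000, 400, 500))   # False: lag of 600 exceeds 500
print(slot_still_usable(1000, 600, 500))   # True: lag of 400 is within 500
```

The first call is the case the answer warns about: with no limit, a permanently disconnected replica keeps the slot valid while the retained WAL grows without bound.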