PostgreSQL – Could not start WAL streaming, replication slot is active

aws-aurora postgresql replication

I want to set up replication between two of my databases.

One is an Aurora database on PG 10 (the publisher), and the other is an RDS database on PG 10 (the subscriber).

But I'm facing an issue. I have 500 GB of data to transfer, so I'm adding tables to the replication one by one and waiting for each table's status to be ready before adding the next one.
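For reference, the table-by-table workflow described here could look roughly like the sketch below. It assumes the publication and subscription already exist under the placeholder name xxx used in this post; my_table is a hypothetical table name, and the table must already exist on the subscriber.

```sql
-- On the publisher: add one more table to the existing publication.
-- "my_table" is a hypothetical name used only for illustration.
ALTER PUBLICATION xxx ADD TABLE my_table;

-- On the subscriber: pick up the newly published table and start its initial copy.
ALTER SUBSCRIPTION xxx REFRESH PUBLICATION;

-- On the subscriber: check the per-table synchronization state.
-- srsubstate: 'i' = initialize, 'd' = data is being copied,
--             's' = synchronized, 'r' = ready (normal replication)
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
ORDER BY 1;
```

A table is safe to consider done once its srsubstate is 'r'.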

But after a while, I get this error on the subscriber side: ERROR: terminating logical replication worker due to timeout. The worker cannot restart, because each time it tries, the following error appears in the logs: ERROR: could not start WAL streaming: ERROR: replication slot "xxx" is active for PID 25860

After that, the WAL keeps growing on the publisher database, and the LSN no longer advances.
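One way to confirm this state is to look at the slot on the publisher. The query below is a minimal sketch using the standard PG 10 catalog view pg_replication_slots. It assumes these functions behave on Aurora as they do on stock PostgreSQL, which I have not verified for every Aurora version.

```sql
-- On the publisher: show which backend holds each slot and how much WAL it retains.
SELECT slot_name,
       active,               -- true while a walsender is attached to the slot
       active_pid,           -- PID of that walsender (25860 in the error above)
       restart_lsn,          -- oldest WAL position the slot still needs
       confirmed_flush_lsn,  -- last position the subscriber confirmed
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots;

-- Inspect the backend that is still holding the slot.
SELECT pid, state, backend_start, state_change
FROM pg_stat_activity
WHERE pid = 25860;
```

If active stays true and confirmed_flush_lsn does not change between runs, the old walsender is still attached to the slot, which matches the "is active for PID" error.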

The commands I ran were quite simple:

CREATE PUBLICATION xxx FOR TABLE xxx;
CREATE SUBSCRIPTION xxx CONNECTION 'xxx' PUBLICATION xxx;

Best Answer

The issue was that wal_sender_timeout and wal_receiver_timeout were too low.

I changed the value from one minute to ten minutes, and replication now works.
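As a rough sketch, the current values can be checked from any SQL session. On RDS and Aurora these settings are normally changed through the DB (cluster) parameter group rather than with ALTER SYSTEM, so the ALTER SYSTEM lines below are left commented out and are an assumption that applies only to a self-managed PostgreSQL server.

```sql
-- On the publisher: timeout after which an unresponsive replication connection is dropped.
SHOW wal_sender_timeout;

-- On the subscriber: timeout after which the worker gives up on a silent publisher.
SHOW wal_receiver_timeout;

-- Self-managed PostgreSQL only (not permitted on RDS/Aurora; use the parameter group there):
-- ALTER SYSTEM SET wal_sender_timeout = '10min';    -- run on the publisher
-- ALTER SYSTEM SET wal_receiver_timeout = '10min';  -- run on the subscriber
-- SELECT pg_reload_conf();                          -- a reload is enough in PG 10, no restart
```

Both settings default to 60 seconds, which is the one-minute value mentioned above.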