I have one main server and two other servers replicating from it.
The disk on the main server filled up (I have an archive_command that puts WAL archives into the /wal_archive/ directory).
Once it was full, the PostgreSQL log started showing an "archive command failed" error:
2019-06-04 09:52:49.079 EEST [3365] LOG: archive command failed with exit code 1
2019-06-04 09:52:49.079 EEST [3365] DETAIL: The failed archive command was: test ! -f /wal_archive/000000010000028C0000003B && cp pg_xlog/000000010000028C0000003B /wal_archive/000000010000028C0000003B
I cleaned up the /wal_archive/ directory with pg_archivecleanup, but PostgreSQL still throws the same error.
I tried reloading the PostgreSQL configuration, without changing the config itself and without restarting, using:
/etc/init.d/postgresql reload
but nothing changed: PostgreSQL still logs the same error.
How can I resume copying WAL archives to the /wal_archive/ directory?
Should I change archive_command to true, reload, and then change it back to the original?
I'm trying to avoid restarting the server itself.
Best Answer
It seems that archiving managed to write part of the WAL archive file before the disk ran out of space.
The test in your archive_command then notices that a file of that name already exists, so the command fails on every retry.
In this case the solution is to manually remove that partially archived WAL segment so that the next attempt to archive it can succeed.
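As a rough sketch of that cleanup, the helper below removes the archived copy only if it differs from the segment still sitting in pg_xlog. The function name remove_partial_segment is made up for illustration, and the paths in the usage note assume the data directory is in $PGDATA; adjust them to your setup.

```shell
# Hypothetical helper: delete an archived WAL segment only when it is an
# incomplete copy of the original segment.
remove_partial_segment() {
    src=$1    # original segment, e.g. the file under pg_xlog
    dest=$2   # archived copy, e.g. the file under /wal_archive

    # Without the original there is nothing to compare against; do not delete.
    [ -f "$src" ] || { echo "source segment not found: $src" >&2; return 2; }

    # Nothing archived under that name, so nothing to clean up.
    [ -f "$dest" ] || return 0

    # Identical content means the copy is complete; leave it alone.
    if cmp -s "$src" "$dest"; then
        echo "archived copy is complete, not removing: $dest" >&2
        return 1
    fi

    # The copy differs from the original, so treat it as partial and remove it.
    rm -f "$dest"
}
```

For the segment named in the log this would be called as remove_partial_segment "$PGDATA/pg_xlog/000000010000028C0000003B" /wal_archive/000000010000028C0000003B. The archiver retries failed segments on its own, so once the partial file is gone the next attempt should go through without a reload or restart.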
You might want to improve your archive_command by removing the file if cp fails (while still returning a non-zero return code).
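One way this could look is a small shell function that keeps the original "do not overwrite" check, removes the target if cp fails, and still reports failure. The name archive_wal and the idea of putting it in a wrapper script are assumptions for illustration, not something from your current setup.

```shell
# Hypothetical replacement for the inline test/cp command.
archive_wal() {
    src=$1    # what PostgreSQL passes as %p
    dest=$2   # e.g. /wal_archive/%f

    # Never overwrite a segment that is already in the archive.
    [ ! -f "$dest" ] || return 1

    # Normal case: the copy succeeds and the segment counts as archived.
    if cp "$src" "$dest"; then
        return 0
    fi

    # The copy failed (for example, disk full): remove whatever was written
    # so the retry is not blocked, and still signal failure to PostgreSQL.
    rm -f "$dest"
    return 1
}
```

If you prefer to keep everything inline, the same logic should fit into a single setting along the lines of archive_command = 'test ! -f /wal_archive/%f && (cp %p /wal_archive/%f || (rm -f /wal_archive/%f; false))'. Either way, a failed copy no longer leaves a partial file behind to trip the test on the next attempt.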