PostgreSQL – FATAL: terminating walreceiver due to timeout

enterprisedb postgresql replication

1/ DESCRIPTION:

  • Machine 1 (slave): CentOS 6.6, x64, PostgreSQL 9.3 installed (local)
  • Machine 2 (master): CentOS 6.6, x64, PostgreSQL 9.3 installed (cloud)

Machine 1 (slave) and machine 2 (master) form a streaming replication cluster. Sometimes I see "FATAL: terminating walreceiver due to timeout" in the slave log.

Here are the full logs:

Slave

2015-03-03 02:01:53 UTC 19693   LOG:  database system is ready to accept read only connections
2015-03-03 02:01:53 UTC 19699   LOG:  started streaming WAL from primary at 0/8000000 on timeline 1
2015-03-03 02:02:15 UTC 19695   LOG:  redo starts at 0/8F04530
2015-03-03 02:39:26 UTC 19699   FATAL:  terminating walreceiver due to timeout
2015-03-03 02:39:26 UTC 19695   LOG:  invalid record length at 0/8F080F8
2015-03-03 02:39:41 UTC 21065   LOG:  started streaming WAL from primary at 0/8000000 on timeline 1
2015-03-03 03:19:12 UTC 21065   FATAL:  terminating walreceiver due to timeout
2015-03-03 03:19:12 UTC 19695   LOG:  invalid record length at 0/9D488F8
2015-03-03 03:19:27 UTC 22489   LOG:  started streaming WAL from primary at 0/9000000 on timeline 1

Master

2015-03-03 02:02:40 UTC 1718   LOG:  database system is ready to accept connections
2015-03-03 02:02:40 UTC 1724   LOG:  autovacuum launcher started
2015-03-03 02:02:42 UTC 1726 [unknown] [unknown]LOG:  invalid length of startup packet
2015-03-03 02:02:42 UTC 1726 [unknown] [unknown]LOG:  connection failed during start up processing: user= database=
2015-03-03 02:35:45 UTC 1788 pgAdmin III - Query Tool enterprisedbERROR:  column "username" does not exist at character 18
2015-03-03 02:35:45 UTC 1788 pgAdmin III - Query Tool enterprisedbSTATEMENT:
        select datname, username, client_addr, client_port, query from pg_stat_activity;
2015-03-03 02:41:03 UTC 1748 walreceiver enterprisedbLOG:  terminating walsender process due to replication timeout
2015-03-03 02:51:42 UTC 3184 ::1 psql.bin enterprisedbERROR:  unrecognized configuration parameter "replication_timeout"
2015-03-03 02:51:42 UTC 3184 ::1 psql.bin enterprisedbSTATEMENT:  show replication_timeout;
2015-03-03 02:51:54 UTC 3184 ::1 psql.bin enterprisedbERROR:  relation "pg_setting" does not exist at character 15
2015-03-03 02:51:54 UTC 3184 ::1 psql.bin enterprisedbSTATEMENT:  select * from pg_setting;
2015-03-03 02:58:33 UTC 3388 [unknown] [unknown]LOG:  invalid length of startup packet
2015-03-03 02:58:33 UTC 3388 [unknown] [unknown]LOG:  connection failed during start up processing: user= database=
2015-03-03 02:58:57 UTC 3390 [unknown] [unknown]LOG:  incomplete startup packet
2015-03-03 02:58:57 UTC 3390 [unknown] [unknown]LOG:  connection failed during start up processing: user= database=
2015-03-03 03:15:04 UTC 3967 [unknown] enterprisedbFATAL:  database "enterprisedb" does not exist
2015-03-03 03:20:53 UTC 2884 walreceiver enterprisedbLOG:  terminating walsender process due to replication timeout

2/ QUESTION:

What causes the "FATAL: terminating walreceiver due to timeout" error, and how can I fix it?

Best Answer

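You need to tune the wal_receiver_timeout parameter on the slave to your needs. It defaults to 60 seconds, and the walreceiver terminates the replication connection when it receives nothing from the master for that long. The "terminating walsender process due to replication timeout" lines in the master log are governed by the matching master-side parameter, wal_sender_timeout, which replaced replication_timeout in 9.3 (hence the "unrecognized configuration parameter" error in your log).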
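As a rough sketch of what that tuning could look like, the postgresql.conf fragments below raise both timeouts. The 5min values are purely illustrative assumptions, not recommendations; pick something that fits the latency and reliability of the link between your local slave and the cloud master.

```
# postgresql.conf on the slave (standby)
# Illustrative value only; the default is 60s. 0 disables the timeout.
wal_receiver_timeout = 5min

# postgresql.conf on the master
# Illustrative value only; the default is 60s. 0 disables the timeout.
# This is the 9.3 name for the old replication_timeout setting.
wal_sender_timeout = 5min
```

Both settings should take effect after a configuration reload (for example, select pg_reload_conf();), without a full restart. You can check the active values with show wal_receiver_timeout; and show wal_sender_timeout;, or by querying the pg_settings view (note the plural; the pg_setting query in your master log failed because of the missing "s").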