PostgreSQL – trigger_file Doesn’t Work

postgresql replication

I set up streaming replication. Replication works just fine, but the standby ignores trigger_file. My configuration has nothing fancy in it, only what many tutorials advise:

recovery.conf:

standby_mode = on
primary_conninfo = 'host=69.69.69.69 port=5432 user=repl password=some_pass_here'
trigger_file = '/var/lib/postgresql/9.3/main/failover_trigger'

I've checked that the postgres user has access to this file. I've also tried changing the location to /tmp and setting the file owner to postgres or root, but nothing helped.

Any ideas? Thanks in advance.

Best Answer

As a starting point for troubleshooting, you can check what is read from recovery.conf by setting log_min_messages to debug2 in postgresql.conf on the slave.

On server start, the trigger file should be shown in the log within a set of entries like this:

 DEBUG:  standby_mode = 'on'
 DEBUG:  primary_conninfo = 'host=69.69.69.69 port=5432 user=repl password=some_pass_here'
 DEBUG:  trigger_file = '/var/lib/postgresql/9.3/main/failover_trigger'
 LOG:  entering standby mode

If the trigger_file entry doesn't show up, the most plausible explanation is that you're editing a recovery.conf in the wrong location.

If, on the other hand, the entry does show up at startup, the following entry should appear when you later create the trigger file to fail over:

 LOG:  trigger file found: /var/lib/postgresql/9.3/main/failover_trigger
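Both checks can be done by scanning the standby's log instead of reading it by eye. The following is only a minimal sketch and not part of the original answer: the function name check_trigger_log is hypothetical, and the patterns assume log lines shaped like the samples above, which in practice depends on log_line_prefix and on log_min_messages being set to debug2.

```python
import re

# Patterns modelled on the sample log entries quoted above (an assumption:
# the real prefix in front of DEBUG:/LOG: depends on log_line_prefix).
CONFIGURED = re.compile(r"DEBUG:\s+trigger_file = '([^']*)'")
FOUND = re.compile(r"LOG:\s+trigger file found: (.+)$")


def check_trigger_log(lines):
    """Return (configured_path, found_path); each is None if never seen."""
    configured = None
    found = None
    for line in lines:
        line = line.rstrip("\n")
        match = CONFIGURED.search(line)
        if match:
            configured = match.group(1)
        match = FOUND.search(line)
        if match:
            found = match.group(1).strip()
    return configured, found
```

Passing an open log file (or any iterable of lines) works. A result of (None, None) suggests the recovery.conf being edited is not the one the server reads, while a path followed by None means the setting was read but the server has not detected the file yet.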

Related Solutions

Postgresql – Does PostgreSQL 9.1 Streaming Replication catch up after a lag without WAL archiving

Yes, it will catch up using streaming only if (and only if) the number of WAL segments generated since the last update on the standby is less than the value of wal_keep_segments in postgresql.conf. This is covered in the Replication section of the documentation.
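The condition above can be written down as a small check. This is a rough sketch under stated assumptions rather than anything PostgreSQL provides: both function names are hypothetical, and the 16 MB segment size is the default, which can differ on a custom build.

```python
# Default WAL segment size; an assumption, since it can be changed at build time.
DEFAULT_SEGMENT_BYTES = 16 * 1024 * 1024


def segments_for_bytes(wal_bytes, segment_bytes=DEFAULT_SEGMENT_BYTES):
    """Number of WAL segments needed to hold wal_bytes, rounded up."""
    return -(-wal_bytes // segment_bytes)


def can_catch_up(segments_since_last_update, wal_keep_segments):
    """Apply the rule stated above: streaming alone suffices only while the
    standby is behind by fewer segments than wal_keep_segments."""
    return segments_since_last_update < wal_keep_segments
```

For example, a standby that missed 100 MB of WAL is about 7 segments behind, so it could still catch up with wal_keep_segments = 32 but not with wal_keep_segments = 4.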

PostgreSQL 9.1 Hot Backup Error: the database system is starting up

The message "The database system is starting up." does not indicate an error. The reason it is at the FATAL level is so that it will always make it to the log, regardless of the setting of log_min_messages:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHEN

After the rsync, did you really run the command you show?

pgsql -c "select pg_stop_backup();";

Since there is, so far as I know, no pgsql executable, that would leave the backup uncompleted, and the slave would never come out of recovery mode. On the other hand, maybe you really did run psql, because otherwise I don't see how the slave would have logged such success messages as:

Log: consistent recovery state reached at 0/BF0000B0

and:

Log: streaming replication successfully connected to primary

Did you try connecting to the slave at this point? What happened?

The "Success. You can now start..." message you mention is generated by initdb, which shouldn't be run as part of setting up a slave; so I think you may be confused about something there. I'm also concerned about these apparently conflicting statements:

The only ways I have restarted Postgres is through the service postgresql-9.1 restart or /etc/init.d/postgresql-9.1 restart commands. After I receive this error, I kill all processes and again try to restart the database...

Did you try to stop the service through the service script? What happened? It might help in understanding the logs if you prefixed lines with more information. We use:

log_line_prefix = '[%m] %p %q<%u %d %r> '

The recovery.conf script looks odd. Are you copying from the master's pg_xlog directory, the slave's active pg_xlog directory, or an archive directory?
