PostgreSQL – timelines and history files in PostgreSQL 9.1

What is the purpose of timelines, history files, and the setting 'recovery_target_timeline' in recovery.conf?

My vague understanding from the PostgreSQL 9.1 documentation is that when the slave completes recovery, it switches to a new timeline so that it does not overwrite the WAL files of the previous timeline.

I am not clear how this is used in a recovery scenario, or what the .history file and setting 'recovery_target_timeline' to 'latest' are for.

Best Answer

When a slave is promoted to a new master, it creates a new timeline so that the names of its WAL files do not overlap with those of the old timeline. The .history file records the database's timeline branches, and the recovery process uses this information to determine which timeline it is working with. By default, recovery stays on the timeline that was current when the base backup was made. So, for example, if you promote one of your slaves to a new master, the other slaves cannot continue recovery from the new master, because they are still following the previous timeline.
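To illustrate why a new timeline keeps WAL names from overlapping, here is a minimal Python sketch. It relies on the fact that a WAL segment file name consists of three 8-digit hexadecimal fields, the first of which is the timeline ID. The function names and the sample segment values are made up for this example and are not part of PostgreSQL.

```python
# Sketch only: helper names and sample values are hypothetical.

def parse_wal_name(name):
    """Split a 24-character WAL segment name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    # Three consecutive 8-hex-digit fields; the first is the timeline ID.
    return tuple(int(name[i:i + 8], 16) for i in (0, 8, 16))


def history_file_name(timeline):
    """Name of the history file that describes the given timeline."""
    return "%08X.history" % timeline


# The same log/segment position on two timelines gives two different names,
# so the promoted server never overwrites segments of the old timeline.
old_segment = "00000001" + "00000000" + "000000BF"
new_segment = "00000002" + "00000000" + "000000BF"

print(parse_wal_name(old_segment))  # (1, 0, 191)
print(parse_wal_name(new_segment))  # (2, 0, 191)
print(history_file_name(2))         # 00000002.history
```

A directory listing of pg_xlog on a promoted server would be expected to show both kinds of names side by side, along with the history file for the new timeline.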

To let them follow the current timeline, in other words to switch them to the new master, they need to know about the new history. Delete everything in the pg_xlog directory on those slaves and copy the history file from the new master. Then set recovery_target_timeline to 'latest' so that recovery follows the latest timeline it finds.

Note that the slave being promoted to new master must be the most caught-up one. Otherwise, switching the other slaves to the new master might corrupt their data.
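As a rough illustration of the recovery.conf a re-pointed slave might end up with, the Python sketch below renders one as text. Only recovery_target_timeline = 'latest' comes from the answer above. The standby_mode and primary_conninfo lines assume a streaming-replication setup, and the host name and user are placeholders.

```python
# Sketch only: assumes streaming replication; host and user are placeholders.

def quote_value(value):
    """Single-quote a value, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_recovery_conf(primary_conninfo):
    """Return recovery.conf text for a slave that should follow the newest timeline."""
    lines = [
        "standby_mode = 'on'",                           # assumption: slave stays a standby
        "primary_conninfo = " + quote_value(primary_conninfo),
        "recovery_target_timeline = 'latest'",           # the setting discussed above
    ]
    return "\n".join(lines) + "\n"


print(render_recovery_conf("host=new-master.example port=5432 user=replicator"))
```

Generating the file from a function is just one option; writing the three lines by hand on each slave has the same effect.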
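One way to decide which slave has caught up the most is to compare the WAL locations they report, for example the output of pg_last_xlog_receive_location() on each slave. The sketch below assumes you have already collected those strings by some means. The standby names and locations are invented for the example.

```python
# Sketch only: standby names and locations are hypothetical sample data.

def parse_xlog_location(text):
    """Turn a location such as '0/BF0000B0' into a comparable (high, low) tuple."""
    high, low = text.split("/")
    return (int(high, 16), int(low, 16))


def most_caught_up(locations):
    """Return the name of the standby with the highest WAL location."""
    return max(locations, key=lambda name: parse_xlog_location(locations[name]))


standbys = {
    "standby_a": "0/BF0000B0",
    "standby_b": "0/BF000150",
    "standby_c": "0/BE00FFF0",
}
print(most_caught_up(standbys))  # standby_b
```

Comparing the two hexadecimal halves as a tuple avoids having to convert the location to a single byte offset.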

Related Solutions

PostgreSQL failover and replication

The slaves won't recognize the new master automatically; you have to repoint them manually.
Yes, they are different, and you have to create new ones for the old master. The old standby will continue to work as a master, but you have to set max_wal_senders on that node. You should also update pg_hba.conf on the new master after failover. After failover (when the nodes swap roles, master to slave and slave to master), you have to transfer the new WAL files to the directory on the new standby that you set in its recovery.conf file, or you can simply use rsync.
Maybe you can use pgbouncer. That way you only have to change the server address in pgbouncer to point to the new master.
EnterpriseDB has some commercial tools. You may want to check them out.

And finally, yes, you are right. There is no single fully automatic solution to these problems.

PostgreSQL 9.1 Hot Backup Error: the database system is starting up

The message "The database system is starting up." does not indicate an error. The reason it is at the FATAL level is so that it will always make it to the log, regardless of the setting of log_min_messages:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHEN

After the rsync, did you really run what you show?:

pgsql -c "select pg_stop_backup();";

Since there is, so far as I know, no pgsql executable, that would leave the backup uncompleted, and the slave would never come out of recovery mode. On the other hand, maybe you really did run psql, because otherwise I don't see how the slave would have logged such success messages as:

Log: consistent recovery state reached at 0/BF0000B0

and:

Log: streaming replication successfully connected to primary

Did you try connecting to the slave at this point? What happened?

The "Success. You can now start..." message you mention is generated by initdb, which shouldn't be run as part of setting up a slave; so I think you may be confused about something there. I'm also concerned about these apparently conflicting statements:

The only ways I have restarted Postgres is through the service postgresql-9.1 restart or /etc/init.d/postgresql-9.1 restart commands. After I receive this error, I kill all processes and again try to restart the database...

Did you try to stop the service through the service script? What happened? It might help in understanding the logs if you prefixed lines with more information. We use:

log_line_prefix = '[%m] %p %q<%u %d %r> '

The recovery.conf script looks odd. Are you copying from the master's pg_xlog directory, the slave's active pg_xlog directory, or an archive directory?
