PostgreSQL: Unable to run repmgr cloned database

postgresqlreplicationrepmgr

I'm running tests with PostgreSQL hot standby with 1 master, and exactly 1 slave.

I am using the instructions on this guide: http://www.howtoforge.com/how-to-set-up-a-postgresql-9.0-hot-standby-streaming-replication-server-with-repmgr-on-opensuse-11.4

I'm using PostgreSQL version 9.1, repmgr 1.1.0 and Ubuntu 10.04 LTS.

I followed steps upto step-6 in the guide where I ran

repmgr -D /var/lib/postgresql/9.1/main -d pgbench -p 5432 -R postgres --verbose standby clone pgmaster

on pgslave.

Then I did a

/etc/init.d/postgresql start

on it and it the script (seemingly) finished successfully.
However, executing psql throws error:

psql: FATAL: the database system is starting up

Any help on proceeding further is welcome.

Best Answer

I searched around and realized that the problem was due to not enabling hot_standby on the standby server. I wrote a corrected, updated and simplified version of the above article as an odt for future personal reference. Here is the text in its entirety (converted to html from odt using openoffice.org) for those who are interested in setting up a single read-only clone of a database using repmgr:

Single Slave Streaming Replication with PostgreSQL

Introduction

This guide aims to quickly help you configure a PostgreSQL 9.1 server with a database, and have it replicated to a slave that can be used for read-only queries. There is no concept of failover involved here, and the slave will only have a read-only copy of the master's data.

1. Tools Needed

A stable GNU/Linux Distribution (Recommended OS: CentOS 6.2 x86-64)
PostgreSQL 9.1 – you may install from here: http://yum.postgresql.org/ (also make sure that the following packages or their equivalnets are installed)
postgresql
postgresql-client
postgresql-contrib
postgresql-server
postgresql-server-dev
2 workstations – a master that runs the primary database and a slave that runs the replicated read-only database (for the purpose of this document, their respective IP addresses have been replaced with pgmaster and pgslave so make those changes in /etc/hosts of both machines if you want to follow the following instructions word for word)

2. Installation Check and Password Creation

Run /etc/init.d/postgresql start on both systems to check if PostgreSQL is functional or not.
Run /etc/init.d/postgresql stop on both systems to stop PostgreSQL. We will not be using PostgreSQL till we finish some configuration related tasks.
Set a password for user postgres on both the systems. This user by default has no password, but we need a password to help create an SSH tunnel between pgmaster and pgslave.
Run sudo passwd postgres on both systems and type a new Unix password for both.

3. SSH Tunnel Creation

On the pgmaster do the following:
su postgres
ssh-keygen -t rsa (press enter at every prompt)
ssh-copy-id -i ~/ssh/id_rsa.pub pgslave (you need to enter pgslave's postgres password)
ssh pgslave and see if you are able to login without password
Repeat the above steps on pgslave
su postgres
sh-keygen -t rsa
ssh-copy-id -i ~/ssh/id_rsa.pubaster
ssh pgmaster and see if you are able to login without password
Make sure you log out from the remote machine after you finish checking connectivity

4. Editing postgresql.conf on pgmaster

You need to make the following changes in the file postgresql.conf that resides in the configuration directory inside /etc/postgresql/ on your machine pgmaster:

listen_addresses = '*'
wal_level = hot_standby
checkpoint_segments=30
archive_mode=on
archive_command='cd .'
max_wal_senders=2
wal_keep_segments=5000

5. Editing postgresql.conf on pgslave

You need to make the following change in the file postgresql.conf that resides in the configuration directory inside /etc/postgresql/ on machine pgslave:

hot_standby=on

6. Editing pg_hba.conf on pgmastger

You need to make the following changes in the file pg_hba.conf that resides in the configuration directory inside /etc/postgresql/ on machine pgmaster:

host all all 192.168.5.0/24 trust
host replication all 192.168.5.0/24 trust

7. Adding PostgreSQL bin folder to PATH

There are a bunch of nifty postgresql utilities we will be using here, so lets set the PATH variable so that it knows where to find them

Execute locate pgbench
The output will be something like /usr/lib/postgresql/9.1/bin/pgbench
Excluding final word pgbench, copy the above path and add it to your path variable.
Execute nano ~/.bashrc
add line export PATH+=:/usr/lib/postgresql/9.1/bin/ (or whatever location the locate command revealed)
Close the file and save changes
You may need to log out and login (or open a new shell) for changes to take effect.

8. Loading pgmaster's PostgreSQL server with dummy data

On pgmaster start the PostgreSQL server first: /etc/init.d/postgresql start
We create a test database and load it with some dummy data with the following commands:
1. su postgres
2. createdb pgbench
3. pgbench -i -s 10 pgbench
Alternatively open the database pgbench yourself, create a sample table and insert sample data into it.

9. Erasing pgslave's data and checking pgmaster connectivity

We are going to erase data directory of pgslave, so execute following on that machine:
First stop PostgreSQL server /etc/init.d/postgresql stop
Move into the PostgreSQL default data directory folder: cd /var/lib/pgsql/data (or to the data directory that is default for your installation)
Empty the directory completely with rm -rf *
Now execute psql -h pgmaster -d pgbench and see if you are able to access the database on pgmaster through pgslave.

10. Installing repmgr

Though single slave replication is fairly straightforward, we are using a tool called repmgr to help make the process even more simpler. Here is how you install it:

Grab repmgr source code from http://projects.2ndquadrant.it/sites/default/files/repmgr-1.1.0.tar.gz and copy it to /tmp for installation.
Have the following packages installed to ensure that compiling repmgr is possible:
make
gcc
postgresql-devel
libxslt-devel
pam-devel
libopenssl-devel
krb5-devel

Extract the downloaded archive and enter the source code folder.

Compile with make USE_PGXS=1

Install with make USE_PGXS=1 install

Execute below two commands to check if repmgr is installed correctly:

repmgr --version

repmgrd --version

11. Cloning pgmaster onto pgslave using repmgr

Execute su postgres and login as the postgres user on pgslave
Run this to clone database: repmgr -D /var/lib/pgsql/data -d pgbench -p 5432 -R postgres --verbose standby clone pgmaster
Note that the -D parameter /var/lib/pgsql/data should be replaced with the appropriate location of the data folder on pgslave
Also note that 5432 is the default port PostgreSQL runs in.
When the command finishes executing (it is going to take several seconds to finish if you had used pgbench to insert random data) you may start PostgreSQL on pgslave with /etc/init.d/postgresql start

12. Testing streaming replication

Congratulations! You have successfully configured pgslave to copy pgmaster's database via streaming replication. Note that due to being in continuous recovery mode, pgslave can only be used to execute read-only queries and no insertions or modifications are possible.

Insert some values into sample_table on pgmaster.
execute select * from sample_table; on pgslave's psql prompt after connecting to pgbench database.
If everything worked properly, you should be able to view newly added tupples on the output from pgslave.

Conclusion

We just learned how to quickly configure a PostgreSQL database to act as a read-only mirror to another database. Note that in case the slave goes offline, it will automatically recover data from the master and would be soon up-to-date. Streaming replication is at database level, so you need to execute the command in part 11for each new database that needs to be replicated via streaming replication.

Related Solutions

PostgreSQL 9.1 Hot Backup Error: the database system is starting up

The message "The database system is starting up." does not indicate an error. The reason it is at the FATAL level is so that it will always make it to the log, regardless of the setting of log_min_messages:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHEN

After the rsync, did you really run what you show?:

pgsql -c "select pg_stop_backup();";

Since there is, so far as I know, no pgsql executable, that would leave the backup uncompleted, and the slave would never come out of recovery mode. On the other hand, maybe you really did run psql, because otherwise I don't see how the slave would have logged such success messages as:

Log: consistent recovery state reached at 0/BF0000B0

and:

Log: streaming replication successfully connected to primary

Did you try connecting to the slave at this point? What happened?

The "Success. You can now start..." message you mention is generated by initdb, which shouldn't be run as part of setting up a slave; so I think you may be confused about something there. I'm also concerned about these apparently conflicting statements:

The only ways I have restarted Postgres is through the service postgresql-9.1 restart or /etc/init.d/postgresql-9.1 restart commands. After I receive this error, I kill all processes and again try to restart the database...

Did you try to stop the service through the service script? What happened? It might help in understanding the logs if you prefixed lines with more information. We use:

log_line_prefix = '[%m] %p %q<%u %d %r> '

The recovery.conf script looks odd. Are you copying from the master's pg_xlog directory, the slave's active pg_xlog directory, or an archive directory?

Postgresql – Witnessing standby failure with repmgr

Since we're using streaming replication, writes to the master will no longer go through when the standby fails.

Not necessarily the case. This is only the case when you are setting up synchronous streaming replication. Normally the master will return immediately after it writes its own log files to disk and not wait for the slave. It is then replicated out when the slave requests.

See the synchronous_commit parameter.

I don't see why you need to futz with repmgr for this.

Best Answer

Related Solutions

PostgreSQL 9.1 Hot Backup Error: the database system is starting up

Postgresql – Witnessing standby failure with repmgr

Related Question