PostgreSQL Failover – What tools should I use

failoverpostgresqlreplication

Here is the scenario:

There are two machines running CentOS 6.2 – machine0 and machine1

Both have PostgreSQL 9.1 installed.

One of them should be active, as a master system and through asynchronous streaming replication the other machine, the standby should copy changes to the database from the master system.

Assuming that machine0 is the master and machine1 is the standby at the beginning.

If the master (say machine0) fails (failure here means the postgresql server crashes) the standby should takeover from the master and become the new master.

machine1, the new master handles all the database operations and when the postgresql server in machine0 gets back online, it should become a standby, start synchronizing from the point it lost contact with machine1 and copy all changes to the database and stay in standby mode.

When machine1 fails, the whole cycle repeats.

When the standby fails and comes back online, it should start reading from the master and synchronize data.

I'm confused as to what are tools that I need to use for setting this up as I understand PostgreSQL does not come with failover by default.

If someone can link me to threads/pages that describe how to do what I am trying, I'll be really grateful.

Best Answer

If you are interested in synchronous DB Replication and Failover, I have a suggestion. You could look into using two tools:

  • DRBD (Disk Replicated Block Device) provides Disk Level Replication (synchronous) between disks on two different servers. The disk changes are replicated over the network. It is best to use a crossover cable on eth2 for the NICs using the 192.168.x.x netblock.

  • ucarp allows for DBVIP management and failover between two different servers.

You can use them in conjunction as follows:

  • DBServer1 has IP 10.1.2.30
  • DBServer2 has IP 10.1.2.40
  • DB VIP is 10.1.2.70

Setup ucarp on two different servers in such a way that each ucarp instance communicates with the other along a virtual router ID

DBServer1 will have

/usr/local/sbin/ucarp -v 200 -p sometagname --interface=eth2 -a 10.1.2.70 -s 10.1.2.30 -b 3 -r 5 --upscript=/usr/local/sbin/vip-up.sh --downscript=/usr/local/sbin/vip-down.sh --neutral -z -B

DBServer2 will have

/usr/local/sbin/ucarp -v 200 -p sometagname --interface=eth2 -a 10.1.2.70 -s 10.1.2.40 -b 4 -r 5 --upscript=/usr/local/sbin/vip-up.sh --downscript=/usr/local/sbin/vip-down.sh --neutral -z -B

What's the reason -b (broadcasts) is 3 on one server and 4 on the other? In case both servers are restarted, the server with the lowest -b will bring ucarp first. As for -r, that is dead ratio. Thus, DBServer1 will come up in 15 seconds (3X5) and DBServer2 comes up in 20 seconds (4X5).

Let's say DBServer1 crashes. ucarp on DBServer2 will check for 20 seconds for a handshaking fro ucarp on DBServer1. If nothing is detected, the vip-up.sh script on DBServer2 will take control of the DBVIP.

OK all of this is just for failover and DBVIP management. What about the PostgreSQL database? Surprisingly, PostgreSQL is only running on one server at a time. Whoever has the DBVIP should have DRBD in Primary state. This ties back in to the vip-up script. How do you script it?

Here is /usr/local/sbin/vip-down.sh script (conceptual)

#! /bin/sh
exec 2> /dev/null
/sbin/drbdadm disconnect drbd0
/sbin/drbdadm  primary drbd0 &&
/bin/mount postgres-data-folder /mnt/drbd0
/sbin/ip addr add 10.1.2.70/24 dev eth2
/sbin/service postgres start &&
/usr/local/sbin/vipmon.sh &
touch /tmp/vip-up

Here is /usr/local/sbin/vip-down.sh script (conceptual)

#! /bin/sh
exec 2> /dev/null
/sbin/service postgres stop
/bin/umount -l  /dev/drbd0
/sbin/drbdadm secondary drbd0
/sbin/ip addr del 10.1.2.70/24 dev eth2
/bin/rm /tmp/vip-up
/usr/bin/killall -9 ucarp
/usr/local/sbin/vip-down.sh
kill `ps auxww | grep vipmon | awk '{print $2}'`

All you need to do is setup pg_hba.conf to make sure all users come via 10.1.2.70

Essentially, vip-up performs this sequence

  • Have DRBD go Primary
  • Mount postgres data folder on /dev/drbd0
  • Startup postgres
  • start vipmon process

Contrawise, vip-down performs this sequence

  • Shutdown postgres
  • unount /dev/drbd0
  • Have DRBD go Secondary

What about vipmon.sh? You could script that to check, in an indefinite loop, the state of postgres (it is running), check the DRBD device (can you still write to the posts data folder)

With this setup, you have postgres on the DRBD Primary and a disk level copy of the postgres data folder on the other DBServer (the DRBD Secondary). postgres is not running on the DRBD Secondary.

When a failover happens, you just need time for ucarp to detect is is safe to mount the postgres data, startup postgres, and acivate the vipmon script.

What great about this setup is that in the event of a failover, the DBServer that becomes DRBD Primary should have a 100% disk level copy of the server you failed away from. Thus starting postgres should level you at a consistent state. The transaction logs (in pg_xlog) should be in tact (less the intermittency due to failover)

I hope these suggestions help. Although I am a MySQL DBA, I use MySQL and DRBD at my employer's web hosting company on a regular basis. I installed MySQL/DRBD in the aforementioned manner. I did this once for a client using PostgreSQL/DRBD. It works great. It's Open Source. You just need to perform due diligence in learning DRBD and ucarp. This would include reconnecting DRBD after a failover, handling split brain scenario where both DB Servers go Primary, and things like these.