PostgreSQL – pgpool-II doesn’t execute the failover script

failover, pgpool, postgresql, scripting

I have installed pgpool-II 3.3.4 from source. I have two pgpool nodes, and I use the watchdog configuration to make the pgpool service highly available. When I stop one node, the other takes over and everything is fine.

I also have two backends, one master and one slave, and I have configured the failover command. Here is my problem: when I stop the master, pgpool doesn't execute the failover script.

These are my backends:

backend_hostname0 = '172.23.0.70'
backend_port0 = 5432
backend_weight0 = 0
backend_data_directory0 = '/var/lib/postgresql/9.3/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '172.23.10.21'
backend_port1 = 5432
backend_weight1 = 0
backend_data_directory1 = '/var/lib/postgresql/9.3/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'

This is my failover command declaration:

failover_command = 'failover.sh %d %M %m'

Of course, I have copied failover.sh to the /sbin/ directory on the pgpool nodes.
This is the failover.sh script:

#!/bin/sh

FALLING_NODE=$1

# The new master
SLAVE1="172.23.10.21"

if test $FALLING_NODE -eq 0
then
    ssh -T postgres@$SLAVE1 "touch /tmp/postgresql.trigger.5432"
    ssh -T postgres@$SLAVE1 "while test ! -f /var/lib/postgresql/9.3/main/recovery.done; do sleep 1; done;"
    ssh -T postgres@$SLAVE1 "/etc/init.d/postgresql restart"
    /usr/local/bin/pcp_attach_node 10 localhost 9898 pgpool pgpool 1
fi

Now, when I run failover.sh manually, it works, so why doesn't pgpool run it?

I run pgpool with sudo, so:

  1. Do I need to run pgpool as another user?
  2. Do I need to create a specific user for the pgpool configuration? It works manually with sudo!
  3. Do I have to change my failover script command?

Best Answer

I had trouble getting the script to run as well, but eventually figured it out.

First, in your failover script, redirect the output of each command to a log file, like this:

ssh -T postgres@$SLAVE1 "touch /tmp/postgresql.trigger.5432" > /tmp/failover.log 2>&1

This might help you identify why it is failing.
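
To capture the output of every step rather than a single command, one option is a small wrapper function. The following is a minimal sketch, not part of the original script: run_logged and the LOG_FILE default are hypothetical names, and the echo call at the end only stands in for the real ssh commands.

```shell
#!/bin/sh
# Assumed log location; any file writable by the user that runs pgpool works.
LOG_FILE="${LOG_FILE:-/tmp/failover.log}"

# run_logged (hypothetical helper): run one command, append its stdout and
# stderr to LOG_FILE, and record who ran it and how it exited.
run_logged() {
    echo "$(date) [$(id -un)] running: $*" >> "$LOG_FILE"
    "$@" >> "$LOG_FILE" 2>&1
    status=$?
    echo "$(date) exit status: $status" >> "$LOG_FILE"
    return "$status"
}

# Harmless stand-in command. In failover.sh this would wrap the real calls, e.g.
#   run_logged ssh -T postgres@"$SLAVE1" "touch /tmp/postgresql.trigger.5432"
run_logged echo "failover logging check"
```

Because the log records the user name and the exit status, it shows both whether pgpool invoked the script at all and which command failed.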

Here is a high-level overview of how I got the failover script running:

  • Install the failover script on the pgpool server.
  • In my setup pgpool runs as root, so the script's ssh calls are made as root.
  • As root on the pgpool server, run ssh-keygen -t rsa, then append the contents of the resulting id_rsa.pub to the authorized_keys file of the postgres user on each db server.

This lets the script log in without interactive authentication; the password prompt was what made it fail in my case.
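
The key-appending step can be sketched as a small helper that avoids adding the same key twice. This is only an illustration under assumptions: add_key is a hypothetical function name, /tmp/pgpool_root.pub is an assumed location for the copied public key, and the postgres account is taken from the ssh calls in the question's script.

```shell
#!/bin/sh
# add_key (hypothetical helper): append a public key to an authorized_keys
# file only if that exact line is not already present.
add_key() {
    pubkey_file=$1
    auth_file=$2
    # Create the directory (mode 700 if newly created) and the file if missing.
    mkdir -p -m 700 "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from a file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# On a db server, after copying root's id_rsa.pub from the pgpool server
# to /tmp/pgpool_root.pub (assumed path):
#   add_key /tmp/pgpool_root.pub ~postgres/.ssh/authorized_keys
#
# Back on the pgpool server, as root, confirm that no prompt appears:
#   ssh -o BatchMode=yes -T postgres@172.23.10.21 true && echo "key login works"
```

BatchMode=yes makes ssh fail instead of asking for a password, which mimics how the script behaves when pgpool runs it with no terminal attached.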