OS: CentOS 7.3
PostgreSQL: 9.6
/etc/hosts
:
192.168.0.1 node1
192.168.0.2 node2
192.168.0.3 # VIP
192.168.0.4 # VIP
Follow this guide did PostgreSQL replication:
I used this way to do first initial database copy:
psql -c "select pg_start_backup('initial_backup');"
rsync -cva --inplace --exclude=*pg_xlog* ~postgres/9.6/data/ 192.168.0.2:~postgres/9.6/data/
psql -c "select pg_stop_backup();"
Then edited pg_hba.conf
, postgresql.conf
to the right IPs.
Then created a recovery.conf
file inserted this:
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.1 port=5432 user=rep password=password'
restore_command = 'cp /var/lib/pgsql/9.6/pg_archive/%f %p'
recovery_target_timeline = 'latest'
Follow official guide:
https://wiki.clusterlabs.org/wiki/PgSQL_Replicated_Cluster
Set config.pcs
:
pcs cluster cib pgsql_cfg
pcs -f pgsql_cfg property set no-quorum-policy="ignore"
pcs -f pgsql_cfg property set stonith-enabled="false"
pcs -f pgsql_cfg resource defaults resource-stickiness="INFINITY"
pcs -f pgsql_cfg resource defaults migration-threshold="1"
pcs -f pgsql_cfg resource create vip-master IPaddr2 \
ip="192.168.0.3" \
nic="ens192" \
cidr_netmask="24" \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="10s" on-fail="restart" \
op stop timeout="60s" interval="0s" on-fail="block"
pcs -f pgsql_cfg resource create vip-rep IPaddr2 \
ip="192.168.0.4" \
nic="ens192" \
cidr_netmask="24" \
meta migration-threshold="0" \
op start timeout="60s" interval="0s" on-fail="stop" \
op monitor timeout="60s" interval="10s" on-fail="restart" \
op stop timeout="60s" interval="0s" on-fail="ignore"
pcs -f pgsql_cfg resource create pgsql pgsql \
pgctl="/usr/pgsql-9.6/bin/pg_ctl" \
psql="/usr/pgsql-9.6/bin/psql" \
pgdata="/var/lib/pgsql/9.6/data/" \
rep_mode="sync" \
node_list="node1 node2" \
restore_command="cp /var/lib/pgsql/9.6/pg_archive/%f %p" \
primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
master_ip="192.168.0.4" \
restart_on_promote='true' \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="4s" on-fail="restart" \
op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
op promote timeout="60s" interval="0s" on-fail="restart" \
op demote timeout="60s" interval="0s" on-fail="stop" \
op stop timeout="60s" interval="0s" on-fail="block" \
op notify timeout="60s" interval="0s"
pcs -f pgsql_cfg resource master msPostgresql pgsql \
master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs -f pgsql_cfg resource group add master-group vip-master vip-rep
pcs -f pgsql_cfg constraint colocation add master-group with Master msPostgresql INFINITY
pcs -f pgsql_cfg constraint order promote msPostgresql then start master-group symmetrical=false score=INFINITY
pcs -f pgsql_cfg constraint order demote msPostgresql then stop master-group symmetrical=false score=0
pcs cluster cib-push pgsql_cfg
Load configuration
sh config.pcs
Check status again:
crm_mon -Afr -1
Result:
Stack: corosync
Current DC: node1 (version 1.1.15-11.el7_3.5-e174ec8) - partition with quorum
Last updated: Thu Jul 27 07:29:35 2017 Last change: Thu Jul 27 07:13:11 2017 by hacluster via crmd on node1
2 nodes and 4 resources configured
Online: [ node1 node2 ]
Full list of resources:
Master/Slave Set: msPostgresql [pgsql]
Masters: [ node2 ]
Slaves: [ node1 ]
Resource Group: master-group
vip-master (ocf::heartbeat:IPaddr2): Started node2
vip-rep (ocf::heartbeat:IPaddr2): Started node2
Node Attributes:
* Node node1:
+ master-pgsql : -INFINITY
+ pgsql-data-status : DISCONNECT
+ pgsql-status : HS:alone
* Node node2:
+ master-pgsql : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 0000000011000140
+ pgsql-status : PRI
Migration Summary:
* Node node1:
* Node node2:
Why node1 can't connected.
And at this time check recovery.conf
file on both nodes:
node1:
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.4 port=5432 user=postgres application_name=node1 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = 'cp /var/lib/pgsql/9.6/pg_archive/%f %p'
recovery_target_timeline = 'latest'
node2:
(null)
Best Answer
Whenever something goes wrong, you have to look at the logs. You will find warnings, error, fatal, and panic messages.
You can find where your logs are in your postgresql.conf file.
Look for the
logging_collector
setting, if it's on, you will find your server logs in the directory specified in thelog_directory
setting.If
logging_collector
is set to off, look at thelog_destination
setting. If it's syslog you need to look at your syslog settings to find where your logs are. If it's stderr you might find something under/proc/<PID>/fd/2
where is the PID of your running PostgreSQL server.You might find this page of documentation usefull.