We've had to move the master around a bit. It started on server 01, then got moved to 02 (which was a slave). We need to move it again, so we built 04 and are trying to slave it off 02, but we are getting the following errors:
2018-02-25 17:00:08 UTC FATAL: highest timeline 3 of the primary is behind recovery timeline 4
2018-02-25 17:00:13 UTC FATAL: highest timeline 3 of the primary is behind recovery timeline 4
2018-02-25 17:00:18 UTC FATAL: highest timeline 3 of the primary is behind recovery timeline 4
The initial base backup was taken like this:
pg_basebackup --verbose --progress -d "host=10.132.x.x user=backup password=...." -D /var/lib/postgresql/9.4/main/ -l 'instance restore' --xlog-method=stream
The recovery file looks like this:
restore_command = 'if [ -f /srv/postgresql/archive/${DATASET}/%f ]; then cp /srv/postgresql/archive/${DATASET}/%f %p; else aws s3 cp --quiet s3://company-backups/postgresql/${DATASET}/archive/%f %p; fi'
standby_mode = 'on'
primary_conninfo = 'host=10.132.x.x user=backup password=....'
recovery_target_timeline = 'latest'
trigger_file = '/var/lib/postgresql/9.4/main/failover'
Best Answer
By the sound of it, you are in a split-brain situation. The original master (01) was never stopped from acting as a master, so after 02 was promoted it simply became a second master.
Fixing such issues pre-9.5 is not so easy (pg_rewind became part of the PostgreSQL ecosystem in that version); you will most probably need some manual cleanup. What is certain is that if 01 received writes after the promotion of 02, they will be lost (or the writes on 02 will be, depending on what you choose to do).
I'd take a logical dump from both 01 and 02 to start (to check whether anything has to be manually replayed from 01 to 02), stop 01 altogether, remove the older timelines' WAL segments from the archive (well, you can move them somewhere else just in case) and then try to build a slave based on 02 again.
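For the "move them somewhere else just in case" step, here is a minimal sketch. It relies on the fact that WAL segment names begin with the timeline ID as 8 upper-case hex digits and that timeline history files are named like 00000004.history. The function name quarantine_timeline and its arguments are made up for illustration; which timeline number to move is something you have to establish yourself from the archive contents and the timeline 02 is actually on.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): move every archived file that
# belongs to one timeline into a quarantine directory instead of deleting it.
# Usage: quarantine_timeline ARCHIVE_DIR QUARANTINE_DIR TIMELINE_NUMBER
quarantine_timeline() {
    archive_dir=$1
    quarantine_dir=$2
    # WAL segments, .history and .backup files all start with the
    # timeline ID written as 8 upper-case hex digits.
    tli=$(printf '%08X' "$3")
    mkdir -p "$quarantine_dir"
    for f in "$archive_dir"/"$tli"*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv -- "$f" "$quarantine_dir"/
    done
}
```

You would run it against the local archive directory used by restore_command while nothing is restoring from it. The S3 copy of the archive would need the same treatment by other means, since the restore_command falls back to it.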
You can also use pg_xlogdump to see which relations (tables, indexes, etc.) got writes since the split brain started. (Note that from version 10 the utility name is pg_waldump.)
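A rough way to boil that output down, assuming the records of interest carry a "rel tablespace/database/relfilenode" triple in their description (which is how pg_xlogdump prints relation references). The function name touched_relations is made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: read pg_xlogdump output on stdin and print each
# distinct tablespace/database/relfilenode triple exactly once.
touched_relations() {
    grep -o 'rel [0-9][0-9]*/[0-9][0-9]*/[0-9][0-9]*' |
        cut -d' ' -f2 |
        sort -u
}
```

Pipe the dump of the segments written since the split into it, e.g. pg_xlogdump FIRST_SEGMENT LAST_SEGMENT | touched_relations, where the two segment file names are placeholders for your own. The last number of each triple is the relfilenode, which you can match against pg_class.relfilenode in the database identified by the middle number.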