I'm testing and trying to figure out how the backup and recovery strategy in PostgreSQL 11 works but it does not work as expected. When I recover the backup, I don't get the correct state. I get any state after a given timestamp. In detail:
I have this setup in database:
wal_level = replica
max_wal_senders = 10
archive_mode = on
I have executed the following commands:
postgres=# select pg_create_physical_replication_slot('slot1');
pg_basebackup -D /media/extern/postgresql_basebackup/
pg_receivewal -D /media/extern/postgresql_archive -S slot1 -v
pg_receivewal is writing into postgresql_archive, this seems to work.
- then I insert something into the database, let's say at 16:21.
- I wait some minutes, let's say until 16:27 and delete all the insert.
Then I want to recover the inserts by doing…
service postgresql stop
(do I need to stop pg_receivewal as well)?
# mv main main.before_recovery
# cp -rp /media/extern/postgresql_basebackup main
nano /var/lib/postgresql/11/main/recovery.conf
I add the following content
restore_command = 'cp -r /media/extern/postgresql_archive/%f %p'
recovery_target_time='2018-01-06 16:22:00'
Then I execute
# chown postgres:postgres recovery.conf
# chmod 700 recovery.conf
service postgresql start
After startup, the file changes to recovery.done, but the state in database is wrong. It didn't restore the data. The database is still empty, so the PITR didn't work, why?
What should I do In case if everything was successful after recovery?. I'm unsure about these questions:
- Creating a new slot2 by executing pg_create_physical_replication_slot('slot2'); and streaming from this slot by pg_receivewal -D /media/extern/postgresql_archive -S slot2 -v?
- Doing a pg_receivewal -D /media/extern/postgresql_archive -S slot1 -v again?
- Do I have to delete anywhen any data in postgresql_archive? (-> Probably not)
- When do I have to create another base backup?
- When and why do I execute select pg_switch_wal(); in this recovery concept?
- How do I deal with new 'timelines' after recovery? I understand the concept of the timelines, but I'm unsure how I should proceed with it after recovery.
[UPDATE]
jjanes pointed out, that the date is wrong. I corrected it, but I still get an empty database after recovery. I tried these steps after I recovered the database, which is now empty:
2019-01-08, 18:50: create a new base backup from this empty database
to have a good starting point.2019-01-08, 18:55 created slot1
pg_receivewal -D /media/extern/postgresql_archive -S slot1 -v
2019-01-08, 18:57 insert
2019-01-08, 19:38 delete
service postgresql stop
# mv main main.before_recovery
# cp -rp /media/extern/postgresql_basebackup main
# chmod 700 main
# nano /var/lib/postgresql/11/main/recovery.conf
restore_command = 'cp -r /media/extern/postgresql_archive/%f %p'
recovery_target_time='2019-01-08 19:20:00'
# chown postgres:postgres recovery.conf
# chmod 700 recovery.conf
# service postgresql start
(btw: I saw that it starts by itself new timelimes after recovery.)
[UPDATE]
I activated in postgresql.conf
checkpoint_flush_after = 256kB
checkpoint_timeout = 30s
I tried to restore the database by using the previous method but without recovery_target_time and without any DELETE statement. I just wanted to look if the current full state can be rebuilt at all. Additionally I renamed the current wal file 00000007000000000000002D.partial to 00000007000000000000002D after stopping the database and before recovering the data.
I'm currently not sure which of these things worked, but at least, the non-PITR recovery seem to work, because I was able to restore the latest data by installing the (empty) base backup and the wal files. At least a step into the right direction, but this only works when you don't miss any shipped wal files. Here, question 2 comes into play: What is the correct workflow to handle the "next shipping iteration".
At first, you can't start pg_receivewal -D /media/extern/postgresql_archive -S slot1 -v
again without deleting the shipped WALs in the external folder. Otherwise it would not start shipping the logs and it would abort with connection errors. This means that you can throw away your old base backup, because when you try the recover again with the old base backup, you would miss all the wal logs between base backup and last recovery point (because you deleted them). That also means that you have to create a new base backup immediately from the recovery point state. When you start log shipping again, you have no "gap" then between the new base backup and the wal files.
[UPDATE]
I got PITR running with the same approach, so I was able to "undo" a delete now. After recovery and service postgresql start
, the database remains in read only mode. I had to execute select pg_wal_replay_resume();
to make it RW again. As already mentioned, I can't do WAL shipping again. When I don't modify the archive folder and try to restart the WAL to ship inside the same folder, I get the following exception:
pg_receivewal: could not send replication command "START_REPLICATION":
ERROR: requested starting point 0/34000000 on timeline 11 is not in
this server's history DETAIL: This server's history forked from
timeline 11 at 0/3310ADA8.
Sure, I could do log shipping to a new archive folder or something, but I guess this is not the professional way to do that. What is the common used strategy to resume log shipping after recovery?
Best Answer
I've done a test following these guidelines:
and PITR worked as aspected. Following what I did with comments. I have a fresh postgres installation where I created a db (test) and one table t1 whith 200 rows created with a simple loop. Contents:
STEP 1), 2), 3)
First start from a clean situation:
Create a backup:
I started pg_receivewall with "-n" to make it stop when postgres server goes down, and put it in background.
STEP 4)
SLEEP two minutes.........
STEP 5)
Note how pg_receivewal stops itself because of parameter "-n" during start.
Let's check the contents in the most recent WAL (XXX4):
In the last line, you can see the commit timestamp for the delete: COMMIT 2019-01-11 15:33:31.342763 CET. This is the timestamp we need to avoid to reach, in order to reject the delete statement. So the recovery.conf will be edited with a timestamp sometimes before this one.
After deleting PGDATA and restoring backup files from HOT-backup path into PGDATA, I edited recovery.conf and restarted postgres.
STEPS 6), 7), 8):
Look at the server log, I left only important lines:
We can see that the recovery process reached a consistent state at lsn 0/30000F8, which is in WAl file XXX3: following WAL XXX4 has been copied but not applied because of the timestamp I configured in recovery.conf.
Of course the rows deleted are now present into the db, so I executed pg_wal_replay_resume() to bring the db RW again:
Now wal files moved to timeline 2:
I did a similar test using an archive_command insted of pg_receivewal and obtained identical results. I suggest you to put the db log in debug, and to check with pg_waldump what is inside WAL files. Something strange is happening to you, and it could be interesting catch out WHAT is happening.
Hope this help. MarcoP
Some answers added on 15 Jan.
What should I do In case if everything was successful after recovery?. I'm unsure about these questions:
Creating a new slot2 by executing pg_create_physical_replication_slot('slot2'); and streaming from this slot by pg_receivewal -D /media/extern/postgresql_archive -S slot2 -v?
This could be a choice. It's up to you to decide which techology best fit your requirements. We do not user pg_receivewal at all, we prefer to use an archive_command.
Doing a pg_receivewal -D /media/extern/postgresql_archive -S slot1 -v again?
No, you cant! replication slost are not included in backup by pg_basebackup. If you look at serverlog during basebackup you will see:
so, when you restore the PGDATA from hot-backup you will not restore any replication slot.
When do I have to create another base backup?
It really depends on your requirements. What are your constraints about Recovery Time Objective and Recovery Point Objective?
How many Wal files your db switches in a single day?
From the doc: Since you have to keep around all the archived WAL files back to your last base backup, the interval between base backups should usually be chosen based on how much storage you want to expend on archived WAL files. You should also consider how long you are prepared to spend recovering, if recovery should be necessary — the system will have to replay all those WAL segments, and that could take awhile if it has been a long time since the last base backup.
When and why do I execute select pg_switch_wal(); in this recovery concept?
This command basically force wal file to shitch to the following one. This grants that the last completed wal is moved to archive dir. In case of disaster where the live PGDATA is lost (and so the pg_wal subdir) this can be important. Wal file are 16MB by default. archive_command copies the wal to the archive destination when switch occurs.
If a disaster happens, let say a single byte before the switch, you lose 16MB of transaction activity. This is why someone prefer pg_receivewal which does not wait for wal completition. BTW you can user archive_timeout to set a value which is compatible with your requirements.