Postgresql – Recovery using recovery.conf

postgresqlrestoreUbuntu

I am trying to recover a database using recovery.conf file for making point-in-time copy.

On recovery.conf file, I added a recovery target time as below.

recovery_target_time = '2011-06-16 04:10:00 IST'

But in the logs I can see if it advances the time by 3.5 hours to 07:40 and actual recovery is attempted for this time, not the time I given in recovery_target_time.

2011-06-16 12:53:45 IST LOG:  starting archive recovery
2011-06-16 12:53:45 IST LOG:  restore_command = '/app/postgresinstall/bin/pg_standby -l /dbdata/archive %f %p %r'
2011-06-16 12:53:45 IST LOG:  recovery_target_time = '2011-06-16 07:40:00+05:30'
cp: cannot stat `/dbdata/archive/00000001.history': No such file or directory

Why this difference of 3.5 hours and I need help in figuring out whats going on in deciding the recovery target time?

Update: Here are time details from database

postgres=# SELECT EXTRACT(timezone_hour FROM now()),EXTRACT(timezone_minute FROM now());
 date_part | date_part 
-----------+-----------
         5 |        30
(1 row)

postgres=# select now();
               now                
----------------------------------
 2011-08-12 13:08:19.333068+05:30
(1 row)

OS Date/Time:
postgres@xxxxx:~$ date
Fri Aug 12 13:10:19 IST 2011

I faced the same problem when I try to restore from WAL archives on same server and my backup server. Also I can consistently reproduce this problem on both servers.

So if I need to create a near to current time backup, I use a recovery_target_time 3.5 hours behind current time and start the DB with recovery.conf

Best Answer

DISCLAIMER : I am not a PostgreSQL DBA, though I dabble a lot with it.

You probably need to check your timezone in the OS and in psql.

In the OS run this:

[postgres@radarPG-db1 ~]$ date
Fri Jun 17 12:55:37 EDT 2011

In psql, run the following:

postgres=# SELECT EXTRACT(timezone_hour FROM now()),EXTRACT(timezone_minute FROM now());
 date_part | date_part
-----------+-----------
        -4 |         0

postgres=# select now();
              now
-------------------------------
 2011-06-17 12:54:39.195291-04

Many times, some forget to have postgres default the timezone to that of the OS. If the WAL files also contain the timezone in its internal timestamps, you must align the timezone set in postgres with that of the WAL files.

Something else to consider is the following:

The default behavior of recovery is to recover along the same timeline that was current when the base backup was taken. If you want to recover into some child timeline (that is, you want to return to some state that was itself generated after a recovery attempt), you need to specify the target timeline ID in 'recovery.conf'. You cannot recover into timelines that branched off earlier than the base backup.

UPDATE 2011-06-20 15:08 EDT

This is right from the online documentation. Please look this over and see if you missed any WAL files along the way. If you waited too long to setup the WAL files on the new server, postgres may have deleted them on startup. You may need to find those WAL files. See if they reside back in pg_xlog folder from the old server.

Recovering using a Continuous Archive Backup

  1. Stop the server, if it's running.

  2. If you have the space to do so, copy the whole cluster data directory and any tablespaces to a temporary location in case you need them later. Note that this precaution will require that you have enough free space on your system to hold two copies of your existing database. If you do not have enough space, you need at the least to copy the contents of the pg_xlog subdirectory of the cluster data directory, as it might contain logs which were not archived before the system went down.

  3. Clean out all existing files and subdirectories under the cluster data directory and under the root directories of any tablespaces you are using.

  4. Restore the database files from your base backup. Be careful that they are restored with the right ownership (the database system user, not root!) and with the right permissions. If you are using tablespaces, you should verify that the symbolic links in pg_tblspc/ were correctly restored.

  5. Remove any files present in pg_xlog/; these came from the backup dump and are therefore probably obsolete rather than current. If you didn't archive pg_xlog/ at all, then recreate it, being careful to ensure that you re-establish it as a symbolic link if you had it set up that way before.

  6. If you had unarchived WAL segment files that you saved in step 2, copy them into pg_xlog/. (It is best to copy them, not move them, so that you still have the unmodified files if a problem occurs and you have to start over.)

  7. Create a recovery command file recovery.conf in the cluster data directory (see Recovery Settings). You might also want to temporarily modify pg_hba.conf to prevent ordinary users from connecting until you are sure the recovery has worked.

  8. Start the server. The server will go into recovery mode and proceed to read through the archived WAL files it needs. Should the recovery be terminated because of an external error, the server can simply be restarted and it will continue recovery. Upon completion of the recovery process, the server will rename recovery.conf to recovery.done (to prevent accidentally re-entering recovery mode in case of a crash later) and then commence normal database operations.

  9. Inspect the contents of the database to ensure you have recovered to where you want to be. If not, return to step 1. If all is well, let in your users by restoring pg_hba.conf to normal.