Postgresql – How to propely perform a recovery

backuppostgresqlrecovery

So, let me give you some context.

I started to work for this company just a few weeks ago, and I'm a jr sysadmin, not a database administrator. Actually, my boss started the same time. The previous boss did not knew what he was doing, so the company hired us to fix a lot of things.

We have a partner that developed the main system we use here and the database(postgres) backup routine was automated by then (old boss had no clue how to do it). My boss asked this partner if the backup/recovery routine was working, but the response was not confident, so he asked me if I could performe a test.

I've done backup/recoveries in databases before, so no problem, but when I got the backup file, to my surprise, it was not a pg_dump file, but a bunch of folders named like 'base', 'pg_clog', 'pg_logical', etc. So, their backup was just a shell script that copy all database folders. That base folder alone is more than 32Gb.

My question is: is this a standard or acceptable way of doing a database backup? I ask because I was wondering if, in case a fatal hardware failure happens in the server, how would I recover this database in another server from this folders?

Best Answer

It could be a "cold backup" where someone shuts down the database and copies the PGDATA directory files. If that is the case, it should be a valid backup. You can test it by copying the entire directory to some test machine with the same architecture as the database server, taking that test machine off the network (in case it tries to reach out and touch someone when you start it up), and then start it up. You can also run the program pg_controldata against this directory. If the database was shutdown at the time the backup was taken, you will get a line like this:

Database cluster state:               shut down

This suggest it was a cold backup and should be valid, but it is not conclusive proof.

It could be a correctly taken hot-backup. If this is the case, the directory should have a file named backup_label. But again, this is not conclusive that the backup is complete and was correctly taken. You need to test the backup. The hot backup could be entirely self-contained, all the needed WAL being located in pg_xlog. Or, it could be usable only in conjunction with an external WAL archive.

Finally, it could be that someone ran cp -r or similar against the database when it was in service without going through the proper steps, and created a useless backup.

You should be able to look at the backup script to at least see what it is attempting to do, that should be easier than doing a from-scratch forensic inspection of the back up directory.