Postgresql – How to verify that postgres 9.3 db has no corruption

corruptiondata integritypostgresqlpostgresql-9.3

I have a postgres 9.3 db running on ubuntu server.
About a month ago we've had some problems with the hardware on the server which was resolved by our VPS hosting company.

The problem was resolved fairly quickly and everything seemed to be working fine.
We have backups running using barman that is on a different server – and backups and restores were working (I checked).

The first sign of data corruption was few days ago: I decided to do a full pg_dump on our DB as I do every once in a while, and it failed (invalid page header in block…) – The data seemed to have been corrupted a long time ago – around the time of the hardware problem (that was the date on the corrupted record). I resorted to finding the corrupted record and I deleted it and restored it manually.

After that I was able to do a full pg_dump.

In order to check for additional corruption – I've set up different db server from a backup and run pg_repack on all the tables to verify that I'm able to rebuild all the indexes and tables.

My questions are:
1. How can I be certain that I don't have any additional corruption in my DB?

2. What can I do to periodically check my data integrity?
3. What else can I do to verify the integrity of our DB besides dumping the whole DB and re-indexing it (which I already did)?

P.S – I don't have block checksums enabled.

Best Answer

If you have initialized your cluster with checksum, you can control this while backuping with pg_backrest.

PostgreSQL has supported page-level checksums since 9.3. If page checksums are enabled pgBackRest will validate the checksums for every file that is copied during a backup.

You will find more information about cheksums here: https://www.postgresql.org/docs/current/static/app-initdb.html#APP-INITDB-DATA-CHECKSUMS

You can also make a pg_dump to /dev/null (if you don't have enough space to keep a logical dump). This should ensure you that you can read each data block.

PGDG had a discussion this year beacuse some wanted to activate checksum by default. You can find it there: http://www.postgresql-archive.org/Checksums-by-default-td5940158.html