Postgres 9.2 – FATAL: XX000: relation mapping file "global/pg_filenode.map" contains incorrect checksum

We have a ~400GB Postgres database (v9.2.6, running on CentOS 6.5). The server had a problem with free disk space, and every command (ls, pwd, etc.) was failing with a segmentation fault. We could still read the disk, so we copied the pg_data directory and rebuilt the server. When we put pg_data back and tried to start the Postgres service, we saw this error:

postgres - FATAL:  XX000: relation mapping file "global/pg_filenode.map" contains incorrect checksum

I know the easiest answer would be to restore the latest backup. As luck would have it, the last pg_basebackup was taken just over 3 weeks ago and we only retain 1 week's worth of WAL logs, so I do not have enough WAL to bring the backup up to date. I know this is a flaw in our setup, and I have learned some lessons.

I tried using the pg_filenode.map file from the 2-week-old backup, but it resulted in the same checksum error. I will note that a VACUUM FULL was run over this past weekend on one of the larger tables, but it completed before the failure.

Is there any chance I can recover from this error? Failing that, any ideas on reading at least some of the smaller tables, the schema, functions, or views, to help me rebuild a new, clean database?

Best Answer

Before doing anything else, I would make a copy of the cluster as it is and keep it safe, following the directions here:

https://wiki.postgresql.org/wiki/Corruption
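
As a minimal sketch of that first step, the helper below copies a data directory to a safe location before any repair is attempted. It assumes the server is stopped and that there is room for a second full copy; the function name `snapshot_cluster` and the paths in the usage note are hypothetical, not something taken from the wiki page.

```shell
#!/bin/sh
# Hypothetical helper: copy a stopped cluster's data directory to a safe place.
# Usage: snapshot_cluster SOURCE_DATADIR DEST_DIR
snapshot_cluster() {
    # Every PostgreSQL data directory has a PG_VERSION file at its top level.
    if [ ! -f "$1/PG_VERSION" ]; then
        echo "not a PostgreSQL data directory: $1" >&2
        return 1
    fi
    # Refuse to write into an existing path, so an earlier copy is never clobbered.
    if [ -e "$2" ]; then
        echo "destination already exists: $2" >&2
        return 1
    fi
    # -a copies recursively and preserves permissions and timestamps
    # (and ownership, when run with sufficient privileges).
    cp -a "$1" "$2"
}
```

It could be called as `snapshot_cluster /path/to/pg_data /path/to/safe/pg_data_master`, with both paths replaced by real ones. All later experiments would then run against a second working copy, never against this master copy.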

Once you have your master copy safe, you might try a fresh initdb in another directory, copy over the pg_filenode.map file as suggested here, and see if that lets your database start:

http://tapoueh.org/blog/2013/09/16-PostgreSQL-data-recovery
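
One way to read that suggestion is sketched below, under some assumptions: the scratch cluster is created with the same PostgreSQL major version as the damaged one (9.2 here), and its `global/pg_filenode.map` replaces the damaged file in a working copy of the data directory, not in the master copy. The function name `swap_filenode_map`, the port, and the paths are hypothetical. Whether a freshly generated map matches the damaged cluster is not guaranteed, so treat this as an experiment.

```shell
#!/bin/sh
# Hypothetical helper: replace the damaged relation map in a WORKING COPY
# with the one from a freshly initialized cluster of the same major version.
# Usage: swap_filenode_map WORK_DATADIR FRESH_DATADIR
swap_filenode_map() {
    relmap=global/pg_filenode.map
    if [ ! -f "$2/$relmap" ]; then
        echo "no $relmap in fresh cluster: $2" >&2
        return 1
    fi
    # Keep the damaged file next to the new one for later inspection.
    if [ -f "$1/$relmap" ]; then
        mv "$1/$relmap" "$1/$relmap.corrupt" || return 1
    fi
    # -p preserves the file mode and timestamps of the fresh map.
    cp -p "$2/$relmap" "$1/$relmap"
}

# Usage sketch (not executed here; directories and port are placeholders):
#   initdb -D /tmp/scratch92
#   swap_filenode_map /srv/pg_work /tmp/scratch92
#   pg_ctl -D /srv/pg_work -o "-p 5433" -w start
```

Starting the working copy on a non-default port keeps applications from connecting to it while it is being examined.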

If that doesn't work, or you encounter more errors, I would ask on the pgsql-general@postgresql.org mailing list for more suggestions, or engage a PostgreSQL consultancy to try to recover your data.

If it does work, I would do a pg_dump immediately, then initdb a new cluster on the new machine, and reload the pg_dump, just to make sure there isn't anything else lurking that could cause problems.
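
The dump-and-reload step could look roughly like the sketch below. It uses `pg_dumpall` rather than per-database `pg_dump`, which is my substitution so that roles and every database are captured in one file; the function names, ports, and file paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for the dump-and-reload step.

# Dump every database plus roles from the recovered server into one SQL file.
# Usage: dump_cluster PORT OUTPUT_FILE
dump_cluster() {
    pg_dumpall -p "$1" -f "$2"
}

# Replay that SQL file into a freshly initialized cluster.
# -X skips any personal psqlrc; "postgres" is the database to connect to first.
# Usage: reload_cluster PORT INPUT_FILE
reload_cluster() {
    psql -X -p "$1" -f "$2" postgres
}

# Usage sketch (not executed here; ports and paths are placeholders):
#   dump_cluster 5433 /mnt/safe/recovered.sql
#   initdb -D /srv/pg_new
#   pg_ctl -D /srv/pg_new -w start
#   reload_cluster 5432 /mnt/safe/recovered.sql
```

Watch the output of both steps closely: errors during the dump point at tables that are still damaged, and those may need to be dumped or skipped individually.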

My sympathies on your issue, and best of luck getting it resolved.
