Without going into the details of how it happened: it all started with a corrupt DB. It was OK on the file-system level, but trying to:
select count(*) from mytable
resulted in the following:
ERROR: could not read block 257798 in file "pg_tblspc/16386/PG_9.6_201608131/16385/16506.1": read only 0 of 8192 bytes
The table is very large (~200GB) and contains some bytea
columns with binary data. It is never updated, only inserted into and selected from.
I ran pg_dump on it; it worked for a long time and produced a ~200GB file, but failed at the end with a similar error:
pg_dump -Z9 -Fc -d mydatabase -t mytable -v -f /datadir/mytable.backup
pg_dump: Dumping the contents of table "mytable" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR: could not read block 257798 in file "pg_tblspc/16386/PG_9.6_201608131/16385/16506.1": read only 0 of 8192 bytes
pg_dump: The command was: COPY public.mytable (id, account_id, fetched, col1, col2, col3, col4, col5) TO stdout;
Then I tried to restore it. Again it ran for a long time, and in the end it took ~200GB of space on that tablespace; however, select count(*) from mytable returns 0.
pg_restore -Fc -d mydatabase -v /datadir/mytable.backup
pg_restore: processing data for table "public.mytable"
and after a long time
pg_restore: could not read from input file: end of file
On server side this resulted in:
ERROR: canceling statement due to user request
CONTEXT: COPY mytable, line 14497030
STATEMENT: COPY mytable (id, account_id, fetched, col1, col2, col3, col4, col5) FROM stdin
LOG: could not send data to client: Connection reset by peer
FATAL: connection to client lost
It looks like pg_restore requested that the whole operation be aborted.
I would like to recover as many rows as possible. I suspect that all the data should still be there, since the table is basically immutable, and that just the last record is corrupted.
Is there a way to force Postgres to keep the records which were restored up till the failed one?
Best Answer
I would suggest you try something along the lines of:
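The query that belongs here was presumably a probing SELECT constrained on the primary key. A minimal sketch, assuming id is an indexed, roughly sequential primary key (both assumptions, since the schema isn't shown); the boundary values are placeholders:

```sql
-- Probe ranges of the table and narrow the range until you find the query
-- that first fails.  Counting a real column (rather than *) forces heap
-- reads, so an index-only scan on id cannot hide the corrupt page.
SELECT count(col1) FROM mytable WHERE id < 10000000;   -- succeeds?
SELECT count(col1) FROM mytable WHERE id < 15000000;   -- fails?
-- ...continue between the last good and the first bad boundary.
```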
... and find at which point it fails. [This may be tedious, but you can just cut in half repeatedly, until you find the value that first fails.]
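The repeated halving is an ordinary binary search over the id range. A sketch of the bookkeeping, with a stand-in probe() (in practice it would run the SELECT above through a driver and report False on the read error; the numbers below are arbitrary mock values):

```python
def find_first_bad(lo, hi, probe):
    """Return the smallest boundary x in [lo, hi] for which probe(x) fails.

    Assumes probe(x) reports whether 'SELECT ... WHERE id < x' succeeds,
    and that every boundary past the corrupt row fails (monotone probe).
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(mid):        # everything below mid is readable
            lo = mid + 1
        else:                 # the corruption sits at or below mid
            hi = mid
    return lo

# Stand-in for the real database probe: pretend row 1234567 is the last
# readable one, so every boundary above it fails.
print(find_first_bad(1, 2_000_000, lambda x: x <= 1_234_567))  # → 1234568
```

Each probe costs one (possibly slow) query, but only about log2(rows) probes are needed, so even a 200GB table narrows down in a few dozen steps.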
You will most probably need to alter the various settings that affect index usage, because you want to force the database to use the index even if it must scan 99% of the table: the point is to keep it from reading the page that blows up.
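These can be disabled per session. A sketch of the knobs I mean (enable_seqscan and enable_bitmapscan are standard planner settings; which ones matter depends on the plan you actually get, so check with EXPLAIN first):

```sql
-- Per session: discourage plans that walk the whole heap, so the WHERE
-- clause on id is satisfied through the index instead.
SET enable_seqscan = off;
SET enable_bitmapscan = off;

-- Confirm the plan really uses the index before running the big query:
EXPLAIN SELECT count(col1) FROM mytable WHERE id < 10000000;
```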
If you can get SELECT to give you most of the data, you can then do something such as: ... and later on: ...
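The two statements that belong here were presumably along these lines; mytable_salvaged and the boundary ids are illustrative placeholders, to be replaced with the values found by the search above:

```sql
-- First: copy everything readable below the corrupt row into a new table.
CREATE TABLE mytable_salvaged AS
    SELECT * FROM mytable WHERE id < 12345678;   -- last known-good boundary

-- And later on: append whatever is readable above the bad block.
INSERT INTO mytable_salvaged
    SELECT * FROM mytable WHERE id > 12345699;   -- first boundary past it
```

If more than one block is damaged, the same probe-and-copy step just repeats for each gap.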
... and you'll have all the data that could be retrieved. I wouldn't drop the old table, in case someone later finds a better method of retrieving the missing information.
Best of luck.