PostgreSQL – pg_restore: [custom archiver] could not read from input file: end of file

pg-dump, pg-restore, postgresql

I won't go into the details of how it happened; it all started with a corrupt DB. The files were fine at the file-system level, but trying to:

select count(*) from mytable

resulted in the following:

ERROR: could not read block 257798 in file "pg_tblspc/16386/PG_9.6_201608131/16385/16506.1": read only 0 of 8192 bytes

The table is very large (~200GB) and contains some bytea columns with binary data. It is never updated, only inserted into and selected from.
I ran pg_dump on it; it worked for a long time and produced a ~200GB file, but failed at the end with a similar error:

pg_dump -Z9 -Fc -d mydatabase -t mytable -v -f /datadir/mytable.backup

pg_dump: Dumping the contents of table "mytable" failed: PQgetResult() failed.

pg_dump: Error message from server: ERROR: could not read block 257798 in file "pg_tblspc/16386/PG_9.6_201608131/16385/16506.1": read only 0 of 8192 bytes

pg_dump: The command was: COPY public.mytable (id, account_id, fetched, col1, col2, col3, col4, col5) TO stdout;

Then I tried to restore it. Again it ran for a long time and in the end took up ~200GB of space in that tablespace, yet select count(*) from mytable returns 0.

 pg_restore -Fc -d mydatabase -v /datadir/mytable.backup

pg_restore: processing data for table "public.mytable"

and after a long time

pg_restore: could not read from input file: end of file

On server side this resulted in:

ERROR: canceling statement due to user request

CONTEXT: COPY mytable, line 14497030

STATEMENT: COPY mytable (id, account_id, fetched, col1, col2, col3, col4, col5) FROM stdin

LOG: could not send data to client: Connection reset by peer

FATAL: connection to client lost

It looks like pg_restore asked the server to abort the whole operation.

I would like to recover as many rows as possible. I suspect that almost all of the data is still there, since the table is basically immutable, and that only the last record is corrupted.

Is there a way to force Postgres to keep the records which were restored up till the failed one?

Best Answer

I would suggest you try something along the lines of:

SELECT count(*) FROM mytable WHERE my_primary_key < value;

... and find the point at which it fails. [This may be tedious, but you can simply bisect: cut the range in half repeatedly until you find the first value that fails.]
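For illustration, a hypothetical bisection might look like the following (the key name and the numbers are placeholders, not values taken from your table):

SELECT count(*) FROM mytable WHERE my_primary_key < 20000000; -- fails: the bad block is below 20M
SELECT count(*) FROM mytable WHERE my_primary_key < 10000000; -- succeeds: the bad block is at or above 10M
SELECT count(*) FROM mytable WHERE my_primary_key < 15000000; -- keep halving until you find the boundary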

You will most probably need to adjust the planner settings that affect index usage, because here you actually want to force the database to use the index even if it has to scan 99% of the table that way. The point is to keep it away from the page that blows up.
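For example, disabling sequential scans for the session is usually enough to push the planner onto the index; this is only a sketch, so check the plan with EXPLAIN before running the real query:

SET enable_seqscan = off; -- make the planner strongly prefer index scans in this session
EXPLAIN SELECT count(*) FROM mytable WHERE my_primary_key < value; -- confirm an index scan is chosen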

If you can get SELECT to give you most of the data, you can then do something such as:

CREATE TABLE my_table_2 AS 
SELECT * FROM my_table WHERE my_primary_key < value;

and later on:

ALTER TABLE my_table RENAME TO my_table_old ;
ALTER TABLE my_table_2 RENAME TO my_table ;

... and you'll have all the data that could be retrieved. I wouldn't drop the old table, in case someone later finds a better method of retrieving the missing data.
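One caveat, stated as an assumption about your schema: CREATE TABLE ... AS copies only the rows, not indexes, constraints or defaults, so you will probably have to recreate those on the new table yourself, e.g.:

ALTER TABLE my_table ADD PRIMARY KEY (my_primary_key);
-- plus whatever other indexes and constraints the original table had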

Best of luck.