PostgreSQL pg_dump ERROR – Missing Chunk Number 0 for Toast Value

postgresql

I'm using PostgreSQL 8.4.15. While running pg_dump to backup a database, I got the following error:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  missing chunk number 0 for toast value 123456789 in pg_toast_987654321
pg_dump: The command was: COPY public.my_table (id, .... all the columns ...)

When searching for this error message, I found a couple of references (here and here) that suggested to reindex the table. (In these discussions, there was a reference to querying the pg_class table to find the right pg_toast_XXXXXX value, but it seemed that it was because it wasn't displayed in their error messages. I skipped this part because I had a value displayed in the error message. I guess it might be a convenience due to a later version of PostgreSQL.)

I ran the following:

REINDEX table pg_toast.pg_toast_987654321;
VACUUM ANALYZE my_table;

I'm now able to use pg_dump without errors.

What's pg_toast and what did these commands actually do? Are these merely about simple cleanup or could they have got rid of some rows in that table? What could have caused the problem in the first place?

There are about 300000 rows in this table, but I would expect there to be only about 250 new rows since the previous successful backup (this table is only used for INSERT/SELECT, no UPDATEs).

Best Answer

Given that what you did was a reindex, what probably happened was it used an index scan to try to locate the toasted values in the table and couldnt find one. This sounds like a corrupted index. Vacuum analyse does alter the table but reindex does not and the changes are very minor.

The way to think about this is that TOASTed attributes are actually broken into chunks of about 4k in size and these are stored in rows. They are looked up and sorted/reconnected with the main row at query time. It sounds like an index used here was corrupted and so the reindex solved the problem.

I have found corrupted indexes are usually a sign that something is not well with the server. It is good to check and make sure memory, CPU's and hard drives are all happy and not reporting problems. I have found overheating servers to be particularly prone to index corruption and if indexes can get corrupt one has to worry about data becoming corrupt too.