I'm wondering: If I start doing a pg_dump of a very large database (it would take hours), that is still running actively receiving writes, what is then the last data that goes into the backup? Is it:
- The data as-it-was, at the point in time where the pg_dump command was initiated.
- The last changes that it encountered at some point where it was dumping that individual record.
- Something else.
Bonus question: If I'm trying to dump a database as part of trying to rescue it from corrupt data, will it then make any difference whether I use the directory
format or the custom
format?
Best Answer
It's the data at the start of the command for an entire database. According to the manpage:
and in SQL Dump:
Dumping in parallel (
--jobs
) may be problematic with changing data, but only when targeting less recent versions:I don't think the output format makes any difference in the rescue operation. Note that for a parallel dump,
directory
is the only possible format.