COPY is documented as not being WAL-logged if it's loaded on the same transactions..
In minimal level, WAL-logging of some bulk operations can be safely skipped, which can make those operations much faster (see Section 14.4.7). Operations in which this optimization can be applied include: CREATE TABLE AS,
CREATE INDEX,
CLUSTER,
COPY into tables that were created or truncated in the same transaction
With that said, do any of the non-durability features speed up a pg_restore -1
?
-1 –single-transaction Execute the restore as a single transaction (that is, wrap the emitted commands in BEGIN/COMMIT). This ensures that either all the commands complete successfully, or no changes are applied. This option implies –exit-on-error.
COPY
is a single transaction, willsynchronous_commit
do anything? Or does that only speed up batch-INSERTS
?- Does making the initial table
UNLOGGED
matter ifCOPY
ignores the WAL log anyway? - What about
max_wal_size
andcheckpoint_timeout
? I assume they would do nothing without WAL-logging? - To what extent does turning off
fsync
do anything other than playing with fire? Are we likely to see a speed up by turning it off?
Best Answer
The answer you refer to is talking about bulkloading a DB, not restoring a dump.
synchronous_commit
will have minimal to no effect onCOPY
, since there's only one commit at the end.COPY
can only reduce WAL, not ignore it, and only ifwal_level = minimal
. Most production systems run withwal_level = hot_standby
(in pg10,wal_level = replica
) or higher.UNLOGGED
tables absolutely do benefitCOPY
at higherwal_level
, but only if you do not later plan on making the table logged. If you do, you're just deferring the WAL cost, since it'll be paid when you convert theUNLOGGED
table toLOGGED
instead of atCOPY
time. UseUNLOGGED
tables if you can regenerate and reload the data later in case of crash and you don't need it to replicate.pg_restore
will restore unlogged tables as unlogged, and logged tables as logged, so it's kind of a moot point.Don't turn
fsync
off unless you really know what you're doing, or you don't care about your data. For bulk loading a new blank PostgreSQL instance,wal_level = minimal
andsynchronous_commit = off
are probably going to be just as good. Better, even, since you avoid a whole bunch of WAL writing.max_wal_size
won't make much difference with bulk loading ifwal_level = minimal
. If you're using a higher WAL level you do need to tune checkpoints for best results.checkpoint_timeout
doesn't really matter since it generally only comes into play on idle-ish systems.So if you're using
pg_restore
to a system configured for physical replication (hot standby/warm standby/archive PITR) you've got to pay the costs for that, and there's not much you can do to speed things up on top of whatCOPY
already does.pg_restore
usesCOPY
.