Postgresql – pg_repack slows down PostgreSQL replication

Tags: postgresql, postgresql-9.5, replication, repmgr

I have a master PostgreSQL 9.5 server and a standby server. For replication I use repmgr (WAL streaming). Typically the delay between master and standby is <5s:

$ psql -t -c "SELECT extract(epoch from now() - pg_last_xact_replay_timestamp());"
  0.044554
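Besides the replay timestamp, the byte-level lag per standby can be watched from the master. A sketch, using the 9.5 function names (later versions rename these to `pg_wal_*`):

```sql
-- Run on the master: how far behind each standby's replay position is,
-- in bytes (PostgreSQL 9.5 naming).
SELECT client_addr,
       pg_xlog_location_diff(pg_current_xlog_location(),
                             replay_location) AS replay_lag_bytes
FROM pg_stat_replication;
```

During a repack this number grows roughly at the rate the table is being rewritten.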

Periodically, pg_repack is invoked on the master to optimize indexes and tables. Repacking tables generates a large volume of WAL and significantly slows down replication, so the standby can fall more than an hour behind the master.

Is there a way to reduce this delay? Is it possible to replicate newly incoming data with a higher priority than the repack changes?

Best Answer

I was about to ask a similar question. The problem is inherent to streaming replication ("physical", block-level), which replicates the actual data writes to the disk(s). With VACUUM FULL (and its index rebuilds), truncate/restore cycles, and now pg_repack, the tables are rewritten on disk, causing a large volume of writes that all need to be streamed to the standby...

Thus, no, I don't believe you'll be able to do any such "prioritization": the moment the rebuilt table is "swapped" in as the active table, new updates/writes on the master go to the rebuilt table, not the old one, and the replica needs that rebuilt table available before it can apply them!

I've gotten into the habit of killing the replication, doing the major data changes (it's perhaps good practice to keep the old standby around as a "backup" until the changes are done), and then re-seeding the standby with a fresh pg_basebackup and restarting replication.
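As a sketch of that re-seed step (the hostname, user, and data directory here are assumptions, adjust for your setup):

```shell
# On the standby: stop PostgreSQL, clear the old data directory, and
# take a fresh base backup from the master with WAL streamed alongside.
pg_ctl -D /var/lib/postgresql/9.5/main stop
rm -rf /var/lib/postgresql/9.5/main/*
pg_basebackup -h master.example.com -U repmgr \
              -D /var/lib/postgresql/9.5/main -X stream -P
# then restore recovery.conf (or use repmgr's standby clone/follow) and start
pg_ctl -D /var/lib/postgresql/9.5/main start
```

Since you are on repmgr, `repmgr standby clone` wraps essentially the same procedure.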

Hope this helps to explain the situation you are in and how I've been solving it till now :)

That said, do go read: https://www.depesz.com/2013/06/21/bloat-removal-by-tuples-moving/

Depesz explains a mechanism that helped him move data to the beginning of the table "on the fly" with data available all the time using code similar to:

-- Delete a batch of rows from the end of the table and immediately
-- re-insert them; PostgreSQL places the new row versions in free space
-- earlier in the table, so the tail pages can eventually be truncated.
with x as (
    delete from test where id in (999997, 999998, 999999) returning *
)
insert into test
    select * from x;

This is then run in batches, with VACUUM statements run in between to free up the space. Done slowly and in a controlled way, you can get the "repack" effect without the replicas falling far behind.
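A minimal driver for that loop could look like the following. This is a sketch only: the table and column names (`test`, `id`), the id range, the batch size, and the throttle interval are all assumptions, and VACUUM must run outside the moving transaction, which is why it is driven from the shell rather than a PL/pgSQL block:

```shell
# Move rows in batches of 1000, vacuuming and pausing between batches
# so the standby has time to replay each chunk of WAL.
for start in $(seq 999000 -1000 0); do
    psql -c "WITH x AS (
                 DELETE FROM test
                 WHERE id BETWEEN $start AND $start + 999
                 RETURNING *
             )
             INSERT INTO test SELECT * FROM x;"
    psql -c "VACUUM test;"   # free the moved pages before the next batch
    sleep 1                  # throttle: let replication catch up
done
```

The `sleep` is the knob that trades total runtime against replication lag.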

Just be careful of any triggers on UPDATE/INSERT/DELETE: the delete-and-reinsert will fire them!
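To see up front which triggers would fire during such a batched move (a sketch; `test` is the assumed table name):

```sql
-- List the non-internal triggers defined on the table being "repacked".
SELECT tgname, pg_get_triggerdef(oid)
FROM pg_trigger
WHERE tgrelid = 'test'::regclass
  AND NOT tgisinternal;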