PostgreSQL – Is it safe to initially rsync without pg_start_backup, provided you snapshot and rsync afterwards?

postgresql replication

Trying to think of ways to speed up slave creation without master downtime.

The normal way of doing it is:

  1. pg_start_backup
  2. rsync database files
  3. pg_stop_backup
  4. rsync wal files and start slave

Since step 2 can take a long time over the network, is it safe to:

  1. rsync
  2. pg_start_backup
  3. rsync again
  4. pg_stop_backup
  5. rsync wal files and start slave

?

Best Answer

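Yes, this is safe. The first rsync only pre-seeds the slave; it does not need to be consistent. What matters is the second rsync, which runs between pg_start_backup and pg_stop_backup and, thanks to rsync's delta transfer, only has to send what changed since the first pass. Files that are still changing during that second pass are made consistent by replaying the WAL copied in the last step, exactly as in the normal procedure.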

You may also want to pass --delete to the second rsync so that data files removed on the master since the first pass do not linger on the slave (think carefully about what you are excluding), but this is a mostly separate issue.
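
For illustration, the five steps from the question could be scripted roughly as below. This is a minimal sketch, not a tested production procedure. It assumes an older PostgreSQL release where pg_start_backup/pg_stop_backup can be called from separate sessions (exclusive backup mode), and the host name, paths, WAL directory name (pg_xlog on old releases, pg_wal on newer ones) and backup label are all hypothetical placeholders. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: every host, path and label below is a hypothetical placeholder.
set -eu

MASTER_PGDATA="${MASTER_PGDATA:-/var/lib/postgresql/data/}"  # trailing slash: copy contents
SLAVE="${SLAVE:-slave.example.com}"
SLAVE_PGDATA="${SLAVE_PGDATA:-/var/lib/postgresql/data/}"
WAL_DIR="${WAL_DIR:-pg_xlog}"   # assumption: pre-10 directory name
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands, 0 = really run them

# Either print the command (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Copy the data directory; --delete drops files that vanished on the master.
sync_data() {
    run rsync -a --delete --exclude=postmaster.pid --exclude="$WAL_DIR/" \
        "$MASTER_PGDATA" "$SLAVE:$SLAVE_PGDATA"
}

base_backup() {
    sync_data                                                   # 1. slow pre-seed, no backup mode
    run psql -c "SELECT pg_start_backup('slave_seed', true);"   # 2. enter backup mode
    sync_data                                                   # 3. fast delta pass
    run psql -c "SELECT pg_stop_backup();"                      # 4. leave backup mode
    run rsync -a "$MASTER_PGDATA$WAL_DIR/" "$SLAVE:$SLAVE_PGDATA$WAL_DIR/"  # 5. WAL files
}

base_backup
```

Running it as-is prints the five commands in order; setting DRY_RUN=0 would execute them on the master. Starting the slave (and whatever recovery configuration your version needs) is left out on purpose.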