Postgresql 9.1 replication delay

postgresql

I just set up a PostgreSQL 9.1.2 database server on debian squeeze.
I configured a standby server (hot standby) on a second identical server (squeeze, PG 9.1.2).

On the master I set these parameters (postgresql.conf) :

wal_level = hot_standby
archive_mode = on
archive_command = 'test ! -f /mnt/pg_replication/myserver/%f && cp %p /mnt/pg_replication/myserver/%f'
archive_timeout = 30
max_wal_senders = 3
wal_keep_segments = 100

On the slave, I set the following parameters (postgresql.conf) :

hot_standby = on

On the slave (recovery.conf) :

standby_mode = 'on'
primary_conninfo = 'host=master_ip port=5432 user=repluser password=xxxx'
restore_command = 'pg_standby -d -s 2 -r 30 -t /tmp/pgsql.trigger.5442 /mnt/pg_replication/myserver/ %f %p %r 2>>/var/log/standby.log'
recovery_end_command = 'rm -f /tmp/pgsql.trigger.5442'
archive_cleanup_command = '/usr/lib/postgresql/9.1/bin/pg_archivecleanup /mnt/pg_replication/myserver/ %r'

This works well but I noticed that the replication delay can take up do 1 minute.
Is there a way to reduce the delay for changes to apply on the standby to a few seconds ? I reduced archive_timeout to 30 seconds but I don't know if this is the right way to make the replication faster…

Another question : how can I monitor the replication delay between the two servers ? I tried the following query on the master but it returns 0 rows… 😕

postgres=# SELECT * from pg_stat_replication;
(Aucune ligne)

Did I missed something ?

Thank you,
Nicolas

EDIT 2012-01-03 : I changed my recovery.conf to this and it is now working well :

restore_command = 'cp /mnt/pg_replication/myserver/%f %p'
standby_mode = on
primary_conninfo = 'host=master_ip port=5432 user=repluser password=xxxx
application_name=mycluster'
trigger_file = '/tmp/trig_mycluster'

Best Answer

The fact that you have no lines in pg_stat_replication on the master indicates that your streaming replication is not working - and it has instead fallen back on the file based replication, which has more delay.

Check your logs on both master and slave, they should tell you why it's not starting up in streaming mode.

Also, your recovery.conf file is incorrect - you should not be using pg_standby together with standby_mode. That's a leftover from pre-9.0 days.