We have a single master/single streaming replica Postgres 9.3 db on AWS. The load is not terribly high – this is a development/staging environment. (The production shows similar metrics). Point is the "ReplicaLag" shown in Cloudwatch oscillates wildly during the day between 0 and 200 seconds. I've changed max_wal_senders from 5 to 10 with no change.
Any suggestions for diagnosing this?
(This is a t2.small master and t2.small replica, however the production instance is large, and exhibits the same issue. CPU is < 2%, connections < 60, iops only about 5-15 count/second).
Best Answer
I had the same issue with our staging RDS. After quite a while I found a blog post that says:
To test this I wrote a simply script which updates a random record every 0.5s and the lag went to 0.
From the AWS docs under Read Replica Limitations with PostgreSQL -