PostgreSQL – Change AWS Read Replica’s Streaming Replication to Trigger-Based Replication

amazon-rdsawspostgresqlpostgresql-9.3replication

We have a Postgres RDS instance for which we have created a read replica instance. The problem we are facing is that, that whenever there is a long running query(about 30-40 mins) on Read Replica, it doesn't complete giving a conflict error. The reason for this conflict is that there was an update on the main RDS, which was getting reflected to Read Replica as it is following streaming replication approach.
For this very reason and also due to new requirement, we need to make this replication process daily, i.e. instead of continous updation we need a trigger based approach(something like a daily backup approach).

I saw in the AWS's console,there are various parameters we can set. Is there some parameter which I can set to achieve my requirement? Or the only way out for me is to delete the read replica instance and use a trigger based replication tool like Bucardo and use an EC2 instance for this?

I would really prefer the approach through which I could tweak some parameter and achieve my objective without having to delete the Read Replica instance.

Best Answer

The problem we are facing is that, that whenever there is a long running query(about 30-40 mins) on Read Replica, it doesn't complete giving a conflict error.

This behavior is controlled by the parameters max_standby_streaming_delay / max_standby_archive_delay. You can fiddle with these parameters in the RDS Parameter Group used by your Read Replica instance to allow more time for queries against your Read Replica to complete.

For this very reason and also due to new requirement, we need to make this replication process daily, i.e. instead of continous updation we need a trigger based approach(something like a daily backup approach).

If you'd like a snapshot of your primary database refreshed nightly, you could do this with a cron job restoring RDS snapshots every night. I don't think RDS has a button to do this automatically for you, but it shouldn't be too hard to script up a nightly create-db-snapshot + restore-db-instance-from-db-snapshot using the AWS CLI, or boto, or whatever interface to AWS you like. You could even maintain a Route53 entry which would always point to the most-recent instance, and leave the old instances lingering for a day or so before being killed off, so that sessions running against existing instances overnight wouldn't be interrupted.

I saw in the AWS's console,there are various parameters we can set. Is there some parameter which I can set to achieve my requirement? Or the only way out for me is to delete the read replica instance and use a trigger based replication tool like Bucardo and use an EC2 instance for this?

Supposedly it is possible to hook up Bucardo to RDS now that RDS Postgres supports the session replication role, but if you want a nightly snapshot I think you'll be much better off using RDS instance snapshots.