I'm currently running RedShift with pushes of all data from our production Postgresql databases every 10 minutes or so. Periodically, the ETL process from Postgresql to Redshift gets delayed or backed up. Is there any way to monitor the lag time between RedShift and Postgresql?
Get the Replication/Lag time of Redshift
redshiftreplication
Related Question
- Dimensional Modeling and ETL in Redshift
- Median Absolute Deviation for Time Series Outlier Detection in Amazon Redshift
- Postgresql – Why does pg_last_xact_replay_timestamp() sometimes return extremely old timestamps
- How to join sparse events for individual users on a “filled in” date ranges in Redshift
- Explicit locks in RedShift
- PostgreSQL – How to Prevent Read Replica Restarts During High Replication Lag
Best Answer
In this context, I usually create a specific table to see the last replication status.
On your
PostgreSql
master server, you create a simple table, example:CREATE TABLE check_replication ( now_date timestamp );
Then you create a batch or job, that performs every minute, or more often:
INSERT INTO check_replication ( now_date ) VALUES (now());
Then, you add this table in the replicated tables from
PostgreSql
toRedShift
And you can check the replication delay.
HTH, Thomas