Postgresql – How to selectively replicate a PostgreSQL database

postgresql

I would like to keep two PostgreSQL databases with the same schema, but have one replicated from a master that only keeps a small portion of the live data.

Basically the master takes in live data and deletes old data outside of the time window, while the replicated slave should keep all the historical data (at least up to the limits of the disk). So in essence, the slave should replicate the master except its deletes.

Is this possible with any "off-the-shelf" PostgreSQL replication method?

Best Answer

Postgres built-in replication is based on log shipping, and as far as I know does not support this (you're shipping the Postgres transaction log from the master to the slaves, so everything - including deletes - gets passed along. You may be able to approximate it by keeping all the transaction logs (forever) so you can roll back to a specific point in time, but that's not a particularly efficient technique.

You can probably rig up something like this with trigger-based replication mechanisms like Slony, but a better solution might be a standard mirror plus the table auditing (tablelog) project which basically creates a historical log of what happened to a table (in a shadow schema) so you can see what was done when, and by whom.
This is best combined with 9.x-style master/slave replication, where you can read from the slaves to poke around in the audit history.