Update: I've posted about this to the AWS forums - please go chime in and ask for it there.
At time of writing, Amazon RDS does not support physical replication outside RDS. You can GRANT
users the REPLICATION
right using an rds_superuser
login, but you can't configure replication
entries for outside IPs in pg_hba.conf
.
Furthermore, when you create a DB parameter group in RDS, some key parameters are shown but locked, e.g. archive_command
, which is locked to /etc/rds/dbbin/pgscripts/rds_wal_archive %p
. AWS RDS for PostgreSQL does not appear to expose these WALs for external access (say, via S3) as it would need to if you were to use WAL-shipping replication for external PITR.
So at this point, if you want wal-shipping, don't use RDS. It's a canned easy-to-use database, but easy-to-use often means that it's also limited, and that's certainly the case here. As Joe Love points out in the comments, it provides WAL shipping and PITR within RDS, but you can't get access to the WAL to that from outside RDS.
So you need to use RDS's own backup facilities - dumps, snapshots and its own WAL-based PITR.
Even if RDS did let you make replication connections (for pg_basebackup
or streaming replication) and allowed you to access archived WAL, you might not be able to actually consume that WAL. RDS runs a patched PostgreSQL, though nobody knows how heavily patched or whether it significantly alters the on-disk format. It also runs on the architecture selected by Amazon, which is probably x64 Linux, but not easily determined. Since PostgreSQL's on disk format and replication are architecture dependent, you could only replicate to hosts with the same architecture as that used by Amazon RDS, and only if your PostgreSQL build was compatible with theirs.
Among other things this means that you don't have any easy way to migrate away from RDS. You'd have to stop all writes to the database for long enough to take a pg_dump
, restore it, and get the new DB running. The usual tricks with replication and failover, with rsync, etc, won't work because you don't have direct access to the DB host.
Even if RDS ran an unpatched PostgreSQL Amazon probably wouldn't want to permit you to do WAL streaming into RDS or import into RDS using pg_basebackup
for security reasons. PostgreSQL treats the data directory as trusted content, and if you've crafted any clever 'LANGUAGE c' functions that hook internal functionality or done anything else tricky you might be able to exploit the server to get greater access than you're supposed to have. So Amazon aren't going to permit inbound WAL anytime soon.
They could support outbound WAL sending, but the above issues with format compatibility, freedom to make changes, etc still apply.
Instead you should use a tool like Londiste or Bucardo.
@joni correctly points out that pg_dump
is what you want. However, I recommend not doing a simple SQL dump. Instead, use a PostgreSQL "custom" format dump that can be fed into pg_restore
. This allows parallel dump/restore (faster) if you use directory format, but it's mostly useful because it offers control over the details of the restore without taking a new dump.
Custom-format dump/restore
For example, assuming you've already done a CREATE DATABASE
on the destination so there's an empty DB named mydb
there:
# dump the local db
pg_dump -Fc -o mydb.dump mydb
# restore to Amazon RDS
pg_restore -h my.amazon.host -U rds_superuser -d mydb --no-owner --no-privileges --no-tablespaces
Roles, tablespaces, foreign servers, GRANT
s, etc
Note that roles (users and groups) are not included in the dump. Neither are foreign data wrapper servers or some other global objects. Thus, things that depend on them will fail to restore. In the above I omit restoration of the owner and of privileges for that reason - meaning you'll need to redo your GRANT
s and REVOKE
s if you have non-default privileges in your database.
Similarly, all objects are restored to the default tablespace, because any tablespaces you may have locally won't be on RDS.
No remote access to rds_superuser
?
If you don't allow your rds_superuser
to log in remotely, but have done a GRANT rds_superuser TO myusername
for some other role, then instead of -U rds_superuser
above you can use:
-U myusername --role=rds_superuser
to get pg_restore
to authenticate as myusername
then SET ROLE rds_superuser;
.
Replication
Most production PostgreSQL database migrations are performed using replication and failover. This isn't currently possible with Amazon RDS. RDS does WAL archiving and streaming replication internally, but offers no way to get at the WAL archives or to create a replication
entry in pg_hba.conf
for replication for clients outside RDS.
For that reason, at time of writing you have to dump and reload to move data into or out of RDS. There's no low-downtime option.
Best Answer
I don't believe that PostgreSQL dropped support for any Linux distribution. Perhaps you mean that there are no binary installation packages provided.
Using a 9.6 client is no problem, since 9.6 is a supported release. You will not be able to use new features like scram-sha-256 authentication, but I guess you can live without that.
Building PostgreSQL from source would be another option.