Postgresql – Moving a Postgres database from standalone local DB to Amazon RDS

postgresql

I have a working Ruby on Rails 4 app running locally on my Mac. I'm running Postgres locally.

I want to move the database (and only the database, not the whole app) to Amazon RDS Postgres.

What is the easiest way to setup all of the tables on the new database and migrate the data over to it?

Best Answer

@joni correctly points out that pg_dump is what you want. However, I recommend not doing a simple SQL dump. Instead, use a PostgreSQL "custom" format dump that can be fed into pg_restore. This allows parallel dump/restore (faster) if you use directory format, but it's mostly useful because it offers control over the details of the restore without taking a new dump.

Custom-format dump/restore

For example, assuming you've already done a CREATE DATABASE on the destination so there's an empty DB named mydb there:

# dump the local db
pg_dump -Fc -o mydb.dump mydb

# restore to Amazon RDS
pg_restore -h my.amazon.host -U rds_superuser -d mydb --no-owner --no-privileges --no-tablespaces

Roles, tablespaces, foreign servers, GRANTs, etc

Note that roles (users and groups) are not included in the dump. Neither are foreign data wrapper servers or some other global objects. Thus, things that depend on them will fail to restore. In the above I omit restoration of the owner and of privileges for that reason - meaning you'll need to redo your GRANTs and REVOKEs if you have non-default privileges in your database.

Similarly, all objects are restored to the default tablespace, because any tablespaces you may have locally won't be on RDS.

No remote access to rds_superuser?

If you don't allow your rds_superuser to log in remotely, but have done a GRANT rds_superuser TO myusername for some other role, then instead of -U rds_superuser above you can use:

-U myusername --role=rds_superuser

to get pg_restore to authenticate as myusername then SET ROLE rds_superuser;.

Replication

Most production PostgreSQL database migrations are performed using replication and failover. This isn't currently possible with Amazon RDS. RDS does WAL archiving and streaming replication internally, but offers no way to get at the WAL archives or to create a replication entry in pg_hba.conf for replication for clients outside RDS.

For that reason, at time of writing you have to dump and reload to move data into or out of RDS. There's no low-downtime option.