Amazon RDS for PostgreSQL – How Amazon Creates RDS for Postgres

amazon-rdspostgresqlschema

I am really curious about the infrastructure of Amazon. When I create a new instance of RDS, it gives me a host, username, password.. Everything. Behind the curtains, how do they do this? It´s some infrastructure like Docker to run multiple instances of PostgreSQL? How do I replicate this?

My problem is: I have many users and I would like to let then manage their own tables. I thought that it could be done with schemas, but they still can see the other users tables, and I dont want that. Any sugestion?

Best Answer

Amazon doesn't talk about this much and the servers are intentionally locked down, so it's hard to be completely sure.

They're EC2 instances that run a custom AMI and have automation tools - in-house, or something like Puppet/Chef/etc. These automation tools communicate with the AWS control panel over web service APIs, SSH push access, etc, and are responsible for managing the PostgreSQL configuration, starting/stopping/reloading the server, etc.

Each EC2 instance runs a single PostgreSQL database server, with its own users, roles, etc.

It's basically just a sealed AWS EC2 instance that you don't have much access to, you just get a locked down non-superuser PostgreSQL connection. Nothing magic.

This isn't the only way to do it. Heroku used to use OpenVZ on top of EC2 to partition EC2 instances into smaller containers, for example. I think these days they always have one EC2 instance per database though.

It sounds like what you want is multi-tenant hosting. You have many options for this:

  • One server per user with a single PostgreSQL instance on each server (EC2 or Heroku style)
  • one PostgreSQL instance per user on a single host server;
  • one database per user on a single PostgreSQL instance;
  • one schema per user in a single PostgreSQL database;
  • a single set of tables with your application limiting access to data within the tables based on enforced WHERE clauses or row-level security policies.

Which to choose depends on trade-offs involving isolation of users, performance, and cost.

There aren't currently any convenient canned recipes to do this that I know of, but searching for "multi-tenant postgresql" will help you find more information.