Feasibility of Managing Thousands of Users in PostgreSQL

Tags: connection-pooling, performance, postgresql, users

We are building a SaaS product that will have at most 50,000 customers. We are considering creating a user in the Postgres database for each customer. We would map each user who logs into our service to a database user, to be certain that they can only access their own data. We also want to implement an audit trail directly in the database with this solution, using triggers. If each customer has its own database user, it would be very easy to see who did what, even when two customers share the same data.
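Roughly, the trigger-based audit we have in mind would look something like the sketch below; the table and column names are illustrative, and `current_user` records the database user performing the change:

```sql
-- Hypothetical audit table.
CREATE TABLE audit_log (
    id         bigserial   PRIMARY KEY,
    table_name text        NOT NULL,
    operation  text        NOT NULL,
    changed_by text        NOT NULL DEFAULT current_user,
    changed_at timestamptz NOT NULL DEFAULT now(),
    row_data   jsonb
);

CREATE FUNCTION audit_trigger_fn() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO audit_log (table_name, operation, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
    ELSE
        INSERT INTO audit_log (table_name, operation, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER row triggers
END;
$$;

-- Attached to a hypothetical "orders" table
-- (use EXECUTE PROCEDURE on PostgreSQL < 11):
CREATE TRIGGER orders_audit
    AFTER INSERT OR UPDATE OR DELETE ON orders
    FOR EACH ROW EXECUTE FUNCTION audit_trigger_fn();
```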

Will we run into unexpected problems, performance-wise or administration-wise, from having 50,000 users in our database? Perhaps connection pooling would become more difficult, but I do not really know whether we would need it.

Best Answer

Yes, it should be fine. You should use connection pooling, though, as Postgres uses a fair amount of memory per connection (about 10 MB, AFAIK).

More than about 500 simultaneously active connections per box (i.e., all querying the database at the same time) will be a problem, though. More CPUs/cores help. Use SSDs in RAID 10.

Your SaaS application should connect as a single user and then SET ROLE to the real user. This lets you use connection pooling, since the connection string stays the same while the effective user changes. You should RESET ROLE when returning a connection to the pool.
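A minimal sketch of the pattern, assuming a pooled login role (here called `app_pool`, an illustrative name) that has been granted membership in each customer role, since SET ROLE only succeeds for roles the session user is a member of:

```sql
-- One-time setup (names and password are illustrative):
CREATE ROLE app_pool LOGIN PASSWORD 'secret';
CREATE ROLE customer_42 NOLOGIN;
GRANT customer_42 TO app_pool;

-- On checking a connection out of the pool for customer 42:
SET ROLE customer_42;   -- current_user (and any audit triggers) now see customer_42

-- ... run the customer's queries here ...

-- Before returning the connection to the pool:
RESET ROLE;             -- back to app_pool
```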

This is not really database authentication. It's proxy authentication (aka Impersonation).

You could also consider separate pools per company or per role.

To make administration easier, you can put users into groups and set permissions via those groups. This is called role-based access control (RBAC).
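A sketch of the group-based setup; the role name and the exact privileges granted are illustrative:

```sql
-- A NOLOGIN group role carries the permissions:
CREATE ROLE customer_rw NOLOGIN;
GRANT USAGE ON SCHEMA public TO customer_rw;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public
    TO customer_rw;

-- Each customer role simply inherits from the group:
GRANT customer_rw TO customer_42;

-- Changing permissions for all 50,000 customers is then a single
-- GRANT/REVOKE on customer_rw.
```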

Update: I was able to create 50,000 users in 2.4 seconds. pgAdmin is noticeably slower due to the number of users, but connecting via JDBC is as fast as before. I was unable to drop all 50,000 users at once, but could drop about 10,000 at a time.
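For reference, here is one way such a test can be scripted; the role names are illustrative, and the drop batch size mirrors what worked above:

```sql
-- Creating 50,000 roles in one shot:
DO $$
BEGIN
    FOR i IN 1..50000 LOOP
        EXECUTE format('CREATE ROLE customer_%s NOLOGIN', i);
    END LOOP;
END $$;

-- Dropping works in batches of roughly 10,000:
DO $$
BEGIN
    FOR i IN 1..10000 LOOP
        EXECUTE format('DROP ROLE IF EXISTS customer_%s', i);
    END LOOP;
END $$;
```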