MySQL – Problems with Creating a Database per Customer

database-designdatabase-recommendationMySQL

I remember from the stackoverflow podcasts that Fog Creek use a database per customer for Fogbugz. I assume that means the Fogbugz On Demand servers have 10s of thousands of databases.

We are just starting to develop a web app and have a similar problem to solve (lots of customers with their own isolated data).

What problems should I expect with using a database per customer? How can I solve them?

My Initial Thoughts

Advantages of a database per customer

  • Simpler database schema
  • Simpler backups – you can backup each customer in turn without it really impacting on other customers.
  • Makes it easy to export a given customers data.
  • Better cache performance – a write to one of the more active tables only impacts that single customer that performed the write.
  • Easier to scale across hardware. For example, when we need to go from 1 to 2 servers, we just move half our customers to the new server.

Disadvantages

  • Can MySQL cope with 5,000 databases? Would performance suck?
  • Changes to the schema can be hard to replicate out across all the databases. We would really really have to have an automated plan for this, such as versioning the schema and a script that understands how to take a database from one version to another.
  • Doing anything that is common to all our customers might be awkward or impossible
  • Similar to above, but any analytics we want to perform across all our customers might be impossible. How should we track usage across all customers for example?

Best Answer

This solution is called a multi-tenant design where each tenant (customer) has their own database. Given that, there are some other considerations to the alternative approach which is a single database:

  1. With a single database, everyone must be on the same version no matter what. It isn't possible to upgrade some customers and not others. This can be problematic if a customer wants a hotfix of an application that isn't ready for wide release.
  2. With a single database, when you do an upgrade, every client is down. If something goes wrong, every client is screwed.
  3. With a single database, it is much more difficult to throttle resources. I.e., if one client is hammering the database, it is harder to give them more resources separate from everyone else.
  4. It is much more difficult to allow users to host their own versions of your application. If you are building a solution that will be used by large enterprises, this is often a non-starter. Their IT department wants complete control over access to the system.
  5. It is probably cheaper to scale out databases rather than scale them up. I.e., having to invest in faster hardware to host one database to rule them all is probably more expensive than being able to scale customers out to smaller, less expensive database servers. I can't say this one definitively because it depends greatly on the server software. If you stick with MySQL, this is probably true because the licensing costs are negligible. However, if you move up to SQL Server for example, scaling out becomes much more expensive unless you use a VPS environment and the cost-benefit of scaling up vs. scaling out changes. I can say, however, that once your database gets very large, management requires ever-greater levels of expertise. Very large databases require playing around with multiple filegroups and pushing certain indexes to different spindles to get better performance. In short, they get can complicated very quickly.

Having separate databases does mean you have to build an update mechanism which matches the database version with the application/site version. However, separate databases do provide superior isolation of data and IMO have a lower cost of hosting. It isn't a solution for all scenarios. If your system were never going to be hosted outside of your hosting and needed to scale up in customers rapidly and having all users on the same version of the application and database schema was desirable, then certainly having a single database is a better approach.

Related Question