Web application for Software as a Service – Shared database or Multiple Databases

database-designdeployment

(I am not sure if this is the best place to post this question. If not, please redirect me to the appropriate stack exchange site.)

I am looking to deploy a web application for multiple clients. The idea is that each client can only access his or hers own data. Let's say that the client is the owner of the application installation. Each installation can have multiple users. All installations will be in my own hosting, so that I can better control the code and business model (its a SaaS kind of thing).

I am looking for technical options to perform this kind of deployment.

[Option 1] – Shared single database

On easy option to do this would be:

  • change the database in order for each relevant table to know its
    "owner id".
  • change all SQL queries so that each User will only fetch
    data that is related to its "owner id" (I don't have the requirement
    for an User to be common to more than one client/installation).

However, this solution has a few drawbacks:

  • requires changing most of the SQL code;
  • there might be some performance impact since the database will have much more data (the data for all installations).
  • mixing data of different clients in the same database may cause some issues (if there is a bug, its easier for someone to end up accessing some else's data);

So I would prefer an alternative where the data would be isolated.

[Option 2] – Multiple databases

One alternative that I was considering was:

  • For each installation, buy a DB from the hosting company
  • Come up with a configuration scheme that allows associating each installation with the database where the connection should be performed.
  • For example, if the user accesses http://very-nice-app.com/client-3/ the code knows that it must connect to database-name-for-client-3.

This solves the potential issue of data leaking, but requires me to maintain several databases (if DB tables change, it has to be done across multiple installations). This can probably be automated somehow.

[Questions]

What would be the better approach? Is there a better alternative than the ones I have indicated?

Thanks in advance for your help.

[Additional info]

  • The database engine will probably be MySQL.
  • This is a startup, so we are trying the business model. We don't have a good idea of how many clients we will have – it will be zero for some time. Right now we are looking for the ability to deploy and scale for a few clients (lets say up to 10).
  • This app is targeted at small business and the amount of data should not be in millions of records. The largest table should have about 10.000 records per year (per client). This value is based on the company with whom we did the prototype (which is a mid sized company for the area x business).

Best Answer

This is the old "how to handle multi-tenancy" question for which there is no one right answer. You have covered a good selection of the key points. I can't provide an answer beyond "it depends" but here are a couple of other points for you to consider:

  • mixing data of different clients ... if there is a bug, its easier for someone to end up accessing some else's data and Come up with a scheme associating each installation [correct] database - the risks here are very similar if the various databases are on or can be accessed by the same hosts, especially from the point of view of a defending against deliberate attacker, so that part of the decision comes down to which collection of potential security concerns you feel you can efficiently design around most effectively. If you feel your SQL skills are lacking that might push you in the multiple database direction due to concerns about large data sizes become inefficient as your client base grows.

  • For each installation, buy a DB from the hosting company for a SaaS solution you have an inappropriate hosting arrangement if you are paying per database or similar. It sounds like traditional shared web hosting were you should be looking at dedicated machines (virtual or physical) or a PaaS based platform.

  • Consider carefully how you intend the application to scale. The multiple databases option gives you the freedom to split clients between database servers as your needs grow, but the single database option is easier to manage in a PaaS environment if you design is appropriate, letting the platform provider worry about the physical reality of scaling as you pay for more resources.

  • Also on the scalability matter, make sure consider your administrative responsibilities: how will you backup the data, restore parts of it if there is an error/accident, or restore all of it in the event of an application/platform wide catastrophe, and what timescales do you want/need to work to in such events?

For more information see https://msdn.microsoft.com/en-us/library/aa479086.aspx, https://en.wikipedia.org/wiki/Multitenancy, and search for similar terms for further discussion. (that first link is from MS, but covers the matter from a general design perspective rather than being specific to MS SQL Server, it has also been around a while but the matters concerned haven't changed as much as you might think the would have in the last decade)