Mysql – Schema for multi-account database

database-designMySQLschema

I'm building an app which will have accounts for organisations leading to organisation.myapp.com subdomains for their account access.

I'm putting together the schema for it and I have a few questions.

This is a simplified overview.

  • An organisation has users
  • An organisation has clients
  • A client has contacts
  • A client has projects
  • A project has an estimate

Some initial thoughts

  • The clients table and the users tables will have organisation_id as a
    foreign key.
  • Contacts and Projects will have client_id as a foreign key
  • Projects will have estimate_id as a foreign key

However, is there an argument that organisation_id should also be a foreign key in Estimates, Contacts and Projects?.

For example, to make sure that only members of the relevant organisation can view Estimates I need to perform a query each time to make sure that the authorised user belongs to the organisation that owns the estimate. I can do this via a join but I wonder if this is unnecessary overhead. On the other hand, repeating organisation_id across pretty much every table seems to go against denormalization.

Is there a best practice I should be following here?

If it's of any importance this is for a MySQL DB with all the tables running InnoDB.

Thanks.

Best Answer

OK - so in your first year, all going well, you might be looking at 50GB for 1k organisations? That's certainly not "pie in the sky". What I'm going to propose is based on my own experiences and other people's opinions may vary depending on theirs. I worked for a company that had fewer organisations but more data, however I think that what worked there may very well apply to your use case.

We used Oracle at the time, and the company, being scrupulous about licences, didn't want to run loads of Oracle servers with attendant costs. So, every client had their own schema (or in MySQL terms, database) on the same server. That is what I am proposing for your situation.

The advantage this has is that if you have to take one client down, you won't affect the others because their data will be independent of each other. Neither will you have to worry about one client seeing another client's data. Furthermore, given that this is a new project, I imagine that you'll want a good deal of flexibility until the app is "bedded down".

This arrangement also gives you the possibility of running different clients on different versions - say your early adapter types on a test release and your laggards on another. You might want to to test on Freemium type clients, but keep your paying customers happy with the stable version?

You can also consolidate some of your data in a central reference schema (i.e. lookup tables - names of countries/US states/post codes...) and grant select on those to all users of the other schemas. You could do something like what's shown here.

Yet another advantage is that when you grow, all you'd have to do is fire up another machine and split your customer base between the two machines (and so on up to ...). Although, I hope you'll have split long before you arrive where this chap did in terms of schema numbers! :-)

The only major downside of this approach as far as I can see is that it will complicate scripting - but then that should be no problem to a man of your calibre (10K on Stackoverflow :-) ). Just to make sure I wasn't committing some appalling IT faux pas, I used your friend and mine, Mr. Google and was pleasantly surprised to come up with these links (1 - note the point about standards, 2 & 3 - see the "Shared schema" part of the accepted answer). There are dissenting voices, but on the whole, they appear to show that I'm not completely delerious! Furthermore, as I said, this is based on my own personal experiences. You may well, of course, choose to consolidate when your app is up, and running stably.

Two final points.

I believe that I've given my take on this part of your question

For example, to make sure that only members of the relevant organisation can view Estimates I need to perform a query each time to make sure that the authorised user belongs to the organisation that owns the estimate. I can do this via a join but I wonder if this is unnecessary overhead.

but I haven't answered the bit as to which tables should have which foreign keys. This is virtually (again, IMHO) impossible to answer for the following reason. I've started many projects with all sorts of ideas but then when you start throwing data at it, all sorts of new issues arise - it's the classic "no plan survives contact with the enemy". Experiment, play around, test. If you come up with particular issues, then post back here, but it would be rash to give advice at this point.

Finally, and this is again (totally) IMHO. If I were starting a project such as this from scratch, I would, without hesitation, choose PostgreSQL over MySQL. How MySQL came to dominate the Open Source database world is a different question. MySQL doesn't have check constraints. It doesn't fully support set operators. It doesn't have CTEs (Common Table Expressions). It doesn't support windowing functions. Just my 0.2c...