Some insights from having actually built a couple of apps with contacts and companies.
Firstly, you're missing several use-cases in your outline. Among the colorful ones I've run into over the years which you don't necessarily cover:
- Some organizations have sub-organizations, be them divisions, subsidiaries, you name it.
- Some people belong to several organizations. As in VP whatever of XYZ, chairman of ABC, and self-employed CEO/consultant -- each on a part time basis, hopefully.
- Some people can be contacts for an organization without actually belonging to it. E.g. when a high profile consultant is temporarily hired as a project lead, he could possibly become the primary contact for company XYZ without actually belonging to it.
- People can have sub-people too. And the latter can be in the same companies, or not. For instance, our VP of XYZ might prefer that you go through his main primary secretary over there, unless you contact him regarding ABC in which case that would be his assistant; or personally for consulting work -- a very busy person indeed, but there are people like that.
- Some organizations have no contacts associated with them. This can happen, for instance, in an app that outputs lists prospect companies (for sales people) in which contacts have yet to be identified.
- Conversely, contacts can have no organizations. For instance consumers.
Secondly, since your outline mentions customers, be wary that customers can be a company or a (physical) person. You seem to have that part covered based on your diagram, so let's move on.
If your next step is to add anything related to accounting, such as an "orders" table, be wary that anything accounting related is tied to a company, a contact, products, prices, etc. at a given point in time. These details can evolve in all sorts of colorful ways, and a frequent design mistake is to create the perfect normalized design and assume that you'll just update the tables as needed. Big no-no. If the taxman asks you to print out your invoices, and your IT says company XYZ at price P when accounting booked company ABC at price Q, you're very, very screwed. And don't get me started on archived/closed yearly accounts and reports that change due to similar design mistakes.
Thirdly, be very, very, very wary of UI/UX issues that you might introduce by normalizing too much. If it does not work like the apps that typical users have in their hands (read: Outlook), you'll be a poorly trained secretary away from changing the company of every employee at XYZ to ABC when Joe gets a new job. I've actually seen this happen and it was not a pretty sight. (She kept her job, in case you're wondering.)
Lastly, don't get me started on merging the inevitable duplicate companies, contacts, and so many other colorful things that couldn't possible turn up. Keep those in mind and make sure that your schema is very, very forgiving because it will happen.
Now... in practice...
Personally, I've come to actually turn a blind eye on normalization in this case. Contacts/Companies is one of those cases where DB design and beautifully normalized data as taught in school is a recipe for trouble in the form of resource hogging, over-complicated queries and ludicrously complex UI. It is, imho anyway, not the right thing to do.
Start by creating a contacts table, complete with fields such as first and last name, display_name if needed, company_name, address, phone, cell_phone, email, email2, secretary_name, secretary_phone, etc. If it's available when you create a contact in Outlook, it probably belongs in there.
You'll note that I didn't mention a companies table. That's because you usually don't want any strong ties between your contacts table and your companies table. If you need one, add one, and add a copmany_id in addition to the company_name. But then, make it a foreign key on delete set null. And be sure to keep it very, very loosely tied to company_name at the database level. Maintain it at the front-end level -- not anywhere else. This will keep a secretary from inadvertently changing the company of multiple contacts.
Keep things sane. Anything that might reasonably show up in a list (i.e. select * from contacts where ... limit 10), be queried against, or be useful frequently ought to be in the table. No joins, nada. Query, iterate through the result, done.
If you've really want additional garbage in there, you've two options.
One is to create an extra_contact_details table. It can be an EAV table, or the full load of company_name, address, phone, etc. fields, or a hogwash of normalized bits and pieces. Whichever option you take, don't over do it. Either will create additional (potentially complex) queries, and heaps of thorny programming issues down the road when you create the UI. What's absolutely key here is, if you go this route, a secretary who has always worked with Outlook needs to make sense of it.
The other, which I'll actually recommend upfront, is to add a text field called "extra_contact_details" in the contacts table and begone with it. No normalization at all. One set of primary details. Frequently used secondary details. Anything extra as plain text. Secretary gets it. End users get it. Done.
Lastly, if you need to store a version of any data at a given point in time, be sure to duplicate the value of any key details -- read: whatever you need to print it exactly as it was when it got inserted.
Best Answer
OK - so in your first year, all going well, you might be looking at 50GB for 1k organisations? That's certainly not "pie in the sky". What I'm going to propose is based on my own experiences and other people's opinions may vary depending on theirs. I worked for a company that had fewer organisations but more data, however I think that what worked there may very well apply to your use case.
We used Oracle at the time, and the company, being scrupulous about licences, didn't want to run loads of Oracle servers with attendant costs. So, every client had their own schema (or in MySQL terms, database) on the same server. That is what I am proposing for your situation.
The advantage this has is that if you have to take one client down, you won't affect the others because their data will be independent of each other. Neither will you have to worry about one client seeing another client's data. Furthermore, given that this is a new project, I imagine that you'll want a good deal of flexibility until the app is "bedded down".
This arrangement also gives you the possibility of running different clients on different versions - say your early adapter types on a test release and your laggards on another. You might want to to test on Freemium type clients, but keep your paying customers happy with the stable version?
You can also consolidate some of your data in a central reference schema (i.e. lookup tables - names of countries/US states/post codes...) and grant select on those to all users of the other schemas. You could do something like what's shown here.
Yet another advantage is that when you grow, all you'd have to do is fire up another machine and split your customer base between the two machines (and so on up to ...). Although, I hope you'll have split long before you arrive where this chap did in terms of schema numbers! :-)
The only major downside of this approach as far as I can see is that it will complicate scripting - but then that should be no problem to a man of your calibre (10K on Stackoverflow :-) ). Just to make sure I wasn't committing some appalling IT faux pas, I used your friend and mine, Mr. Google and was pleasantly surprised to come up with these links (1 - note the point about standards, 2 & 3 - see the "Shared schema" part of the accepted answer). There are dissenting voices, but on the whole, they appear to show that I'm not completely delerious! Furthermore, as I said, this is based on my own personal experiences. You may well, of course, choose to consolidate when your app is up, and running stably.
Two final points.
I believe that I've given my take on this part of your question
but I haven't answered the bit as to which tables should have which foreign keys. This is virtually (again, IMHO) impossible to answer for the following reason. I've started many projects with all sorts of ideas but then when you start throwing data at it, all sorts of new issues arise - it's the classic "no plan survives contact with the enemy". Experiment, play around, test. If you come up with particular issues, then post back here, but it would be rash to give advice at this point.
Finally, and this is again (totally) IMHO. If I were starting a project such as this from scratch, I would, without hesitation, choose PostgreSQL over MySQL. How MySQL came to dominate the Open Source database world is a different question. MySQL doesn't have check constraints. It doesn't fully support set operators. It doesn't have CTEs (Common Table Expressions). It doesn't support windowing functions. Just my 0.2c...