Pros and cons of a particular DB design approach for single vs multiple tables

cjavaPHP

I just got a new project (in JAVA) where I am thinking of all the pros and cons of a particular DB design approach. Some pointers:

  1. Our app has many "content" entities like News, Blogs, CMS pages, Wikis, Suggestions etc. I feel that most of these entities are similar (having 75% same fields) but 25% are different. For example, blogs might have a few extra columns which News does not have.

  2. Since every entity seems "similar", I am thinking of keeping them in the same DB table with all the columns covering each entity.

  3. Data load: A customer might have 100K news, 50K blogs, 10K CMS pages, 100K wikis and 50K Suggestions. This is the top-end (maximum) a customer might have given our product. Some customers might have just 2K-5K items of each type in their DB. Lower end data load: 1K items of each type in DB.

  4. There will be more "reads" than "inserts" in this app (customer usage).

My questions:

  1. Will having all this data in a single table hurt performance too much? Note that we will be using Lucene for searching.

  2. Having a single content table with an entity type column which specifies the entity type, we can have a single business layer handling these entities. This will reduce the code base considerably. Any cons of this approach?

Best Answer

Note, that I assume C# (since this language is present in tags) and Entity Framework.

Both approaches assumes, that entity types are inherited from abstract, non-persistent base entity type, which accumulates 75% of fields, so the code base discussion is not relevant.

Pros of Table-Per-Hierarchy (all descendants are stored in single table):

  • database structure is simpler (just one table)
  • queries are simpler (with Table-Per-Type you need joins to select descendant's row)
  • auditing is simpler

Cons of Table-Per-Hierarchy:

  • the table has low-density data (a number of nullable fields, filled dependent on row type). This isn't a problem, when there are 5-6 such fields, but what if there are 20-25 of them?
  • table relations and indexes structure are not obvious. Since your entities will have relations with other entities, this will lead you to foreign key madness. Just imagine foreign keys for you table.
  • database structure isn't obvious. Who can guess, that news are stored beside CMS pages?
  • usually, 90% of entity types have an Id and Name-like fields (Title, Header, etc). Are you going to store all of them, using single table?
  • table size will grow very fast. With the lapse of time some operations (e.g., deletion) will be slower.

IMHO, every inheritance scheme, mapped to a relational database, should be used very carefully. Definitely you shouldn't map all you entities to a single table.