Planning simple CMS: How much relation

central-management-serverdatabase-designdatabase-theorydatabase-tuningoptimization

I'm currently in the process of designing the database for a rather minimalistic Content Management System I'm working on alone. It should allow scalability while remaining speed, especially on boxes with low-end hardware.

As I am doing database design this carefully for the first time I am struggling more than a bit and could use some help. I've made two models for the database, which both fail to satisfy me:

  • Model 1: Everything dynamic is an entity (i.E. menu items, blog posts, projects):

    Model 1

    I want to achieve maximal scalability by using the same base for every content type, but one of my examples already fails: the menu. Not all menu item's might have a URL (dropdown-item's for example) and some might be separator, which would be impossible to achieve with this.

  • Model 2: Posted dynamic content are entities (i.E. blog posts, projects, comments):

    Model 2

    Not as scalable as Model 1, but most likely faster. Still, contents won't need a preview (which is shown as description on the landing page) as they're only shown on their parent's pages.

The Project-class is just an example class for content-type-specific metadata, in this case the info for a button that could lead to a github repository, an image gallery and so on.
I've got the feeling I'm completely failing at this, I also fear that this will result in hundreds of queries when trying to get comments on something (checking for parent's on each one for correct nesting etc.).

I'll be glad for everyone getting me an advice on how to approach this better, a tip on how to view this from a different angle or anything else that could point me in the right direction.

Best Answer

Once you start stepping into areas like hierarchies, you probably want to know your RDBMS to some extent. If your RDBMS supports recursive queries natively your hierarchies will look very different then if each iteration is a round trip to the db.

This being said there are a few warnings that come to my mind looking at your schema. I believe these are likely to impact both performance and data integrity.

  1. The separate parents table strikes me as somewhat dangerous. What are the rules for parents and children? Can you have blog -> menu_item -> blog -> comment -> menu_item? This first creates the possibility of problems that may be difficult to track down, but also consider that if the rules are based on smaller tables, actually enforcing the rules becomes possible If your menu items are broken off, a caching layer becomes a possibility. Same with other things that are less frequently changed, and the amount of effort to generate the menu on the part of the db will be reduced.

  2. I actually think your second model, if things like menu items, etc. are broken off is likely to be both more scalable and faster. I would still suggest breaking apart comments and blog posts, and tracking the relationships there more closely. If you are worried about scalability and your site has 10000 blog posts and 1000000 comments, being able to filter the comments also by the attached blog post is likely to mean you can get to use better indexes at least on some db's.

I am reminded of the phrase "premature optimization is the root of all evil." Design a good db schema first and then worry about speed and scalability.