Mysql – Nested table hierarchy: foreign key duplication vs. join

duplicationforeign keyhierarchyMySQLperformance

I have a nested table hierarchy, similar to the following one (this is an example but demonstrates the schema well enough):

+---------+  +---------+  +---------+  +-------------+
| users   |  | blogs   |  | posts   |  | comments    |
+---------+  +---------+  +---------+  +-------------+
| user_id |  | blog_id |  | post_id |  | comment_id  |
| email   |  | user_id |  | blog_id |  | post_id     |
+---------+  | name    |  | content |  | comment     |
             +---------+  +---------+  +-------------+

So, the object hierarchy is clear: a comment belongs to a single post, which belongs to a single blog, which belongs to a single user.

Now, in my app code, I want to verify that comments can only be read by the user who owns the blog which owns the post (assume these are private comments, ok?). E.g., when I SELECT * FROM comments WHERE comment_id = 666, I want to verify that the requesting user "owns" that comment.

My dillema is this: Would you advice that I:

  • duplicate the user_id to comments, so I always have a quick reference to the owning user, and checking the user id (during the select or later) is straightforward, or
  • Write my DB access code (a mapper or what have you) to always JOIN and fetch the user_id field as part of the returned object. Mind you that in this case I need two JOINs to do that

Basically I see it as performance vs. the so called bad practice of data duplication – but is it really such a bad practice in this case? It is unlikely that user_id values will ever change or that a blog changes its user_id.

I am using MySQL (MariaDB in fact) if that makes any difference.

Best Answer

You should go with second option obviously for many reasons:

  • Will you add user id again if you need to come up with an entity at same level of comment? May be it won't happen in your scenario, but still.

  • Storage, as your database size grows

  • There won't be performance impact much as your query is highly selective and if you have good index design.