I have a nested table hierarchy, similar to the following one (this is an example but demonstrates the schema well enough):
+---------+ +---------+ +---------+ +-------------+
| users | | blogs | | posts | | comments |
+---------+ +---------+ +---------+ +-------------+
| user_id | | blog_id | | post_id | | comment_id |
| email | | user_id | | blog_id | | post_id |
+---------+ | name | | content | | comment |
+---------+ +---------+ +-------------+
So, the object hierarchy is clear: a comment belongs to a single post, which belongs to a single blog, which belongs to a single user.
Now, in my app code, I want to verify that comments can only be read by the user who owns the blog which owns the post (assume these are private comments, ok?). E.g., when I SELECT * FROM comments WHERE comment_id = 666
, I want to verify that the requesting user "owns" that comment.
My dillema is this: Would you advice that I:
- duplicate the
user_id
tocomments
, so I always have a quick reference to the owning user, and checking the user id (during the select or later) is straightforward, or - Write my DB access code (a mapper or what have you) to always JOIN and fetch the user_id field as part of the returned object. Mind you that in this case I need two JOINs to do that
Basically I see it as performance vs. the so called bad practice of data duplication – but is it really such a bad practice in this case? It is unlikely that user_id
values will ever change or that a blog
changes its user_id
.
I am using MySQL (MariaDB in fact) if that makes any difference.
Best Answer
You should go with second option obviously for many reasons:
Will you add user id again if you need to come up with an entity at same level of comment? May be it won't happen in your scenario, but still.
Storage, as your database size grows
There won't be performance impact much as your query is highly selective and if you have good index design.