That was a hard title to digest even for me so I'll break it down.
I'm very new to SQL, and I just designed my first database. I also am not very good at JOINS
which might be the answer to my problem. But to get back on track I currently have three tables of interest:
comments
id | author | category | date_submitted
---------+-------------------+-----------------+----------------
replies
id | date_submitted | comment_id
---------+----------------+------------
images
image_url | reply_id | original_id
--------------------------------+----------+---------------------------------------------------
So given this, the question came to mind when I was trying get a table with the columns comments.id
and images.original_id
, an operation that I would like to do often.
Currently to do this I can't directly JOIN
comments
and images
because they have nothing in common, meaning that the middleman has to be the replies
table. So I have to JOIN
replies
and images
and then JOIN
that with comments
.
So my question is this. Would it be okay to add a foreign key column referencing comments
to the images
table so that I can directly JOIN
the two tables instead of going images -> replies -> comments, or would it be bad practice to add a foreign key that I already have to access to through the replies
table?
Best Answer
This answer expands on the comment given by Pieter Geerkens.
The definition of third normal form states that it rules out what are called transitive dependencies. The description given of 3NF in Wikipedia (see article) has an example of a transitive dependency.
Your description of the foreign keys in your Q strongly suggests, but does not absolutely prove, a transitive dependency. The only way you can truly get the dependencies in the subject matter is to study the subject matter.
So, assuming that your example does have a transitive dependency, then your Q boils down to "what advantage do you get by conforming to third normal form"?
The answer to this is that violations of 3NF result in harmful redundancy. Harmful redundancy is a situation where the same assertion is stored multiple times in different places in the database (in this case multiple rows of the table in question). If your design permits harmful redundancy, it will also permit mutually contradictory assertions, which you will have to avoid by careful programming.
It's better, all other things being equal, to conform to 3NF and prevent the contradictions that way.
All other things are seldom equal. There are reasons for adopting a design that doesn't conform to 3NF, in spite of the risk. One design that comes to mind is star schema, but don't worry about star schema until you've got 3NF under your belt.