Database Design – Should a Table Have Multiple Foreign Keys?

database-design

That was a hard title to digest even for me so I'll break it down.

I'm very new to SQL, and I just designed my first database. I also am not very good at JOINS which might be the answer to my problem. But to get back on track I currently have three tables of interest:

comments

   id    |      author       |     category    | date_submitted 
---------+-------------------+-----------------+----------------

replies

   id    | date_submitted | comment_id 
---------+----------------+------------

images

           image_url            | reply_id |                    original_id                    
--------------------------------+----------+---------------------------------------------------

So given this, the question came to mind when I was trying get a table with the columns comments.id and images.original_id, an operation that I would like to do often.

Currently to do this I can't directly JOIN comments and images because they have nothing in common, meaning that the middleman has to be the replies table. So I have to JOIN replies and images and then JOIN that with comments.

So my question is this. Would it be okay to add a foreign key column referencing comments to the images table so that I can directly JOIN the two tables instead of going images -> replies -> comments, or would it be bad practice to add a foreign key that I already have to access to through the replies table?

Best Answer

This answer expands on the comment given by Pieter Geerkens.

The definition of third normal form states that it rules out what are called transitive dependencies. The description given of 3NF in Wikipedia (see article) has an example of a transitive dependency.

Your description of the foreign keys in your Q strongly suggests, but does not absolutely prove, a transitive dependency. The only way you can truly get the dependencies in the subject matter is to study the subject matter.

So, assuming that your example does have a transitive dependency, then your Q boils down to "what advantage do you get by conforming to third normal form"?

The answer to this is that violations of 3NF result in harmful redundancy. Harmful redundancy is a situation where the same assertion is stored multiple times in different places in the database (in this case multiple rows of the table in question). If your design permits harmful redundancy, it will also permit mutually contradictory assertions, which you will have to avoid by careful programming.

It's better, all other things being equal, to conform to 3NF and prevent the contradictions that way.

All other things are seldom equal. There are reasons for adopting a design that doesn't conform to 3NF, in spite of the risk. One design that comes to mind is star schema, but don't worry about star schema until you've got 3NF under your belt.