Help wrapping the mind around many-to-many relationships

relational-theory

I'm making a database that needs to have the following tables: Article, Video, Image, and Keyword. Keyword contains a fixed list of keywords. (For example when creating an article, you'd have to select from a list of keywords, not create a new one) Each Article, Image, and Video can have n number of keywords. I'm having a hard time wrapping my mind around how I'd define these relationships in my database, as I've really only dealt with one-to-one and one-to-many in the applications I've had to develop in the past. Would I do a many-to-many between Keyword and the other tables? Or would I need to make a join table between each table? KeywordImage (keyword_id, image_id)? Or are neither of those options the right way to think about this?

If a join table is the right approach, would it be one-to-many between Article and ArticleKeyword?

edit
Keyword isn't a required field in any of the tables

Best Answer

Something like:

Article                ArticleKeyword         Keyword
---------------        ---------------        ------------
article_id (PK)  <---  article_id (FK)        
                       keyword_id (FK)  --->  keyword_id (PK)

There could be zero-to-many ArticleKeyword rows for each row in Article.

There could be zero-to-many ArticleKeyword rows for each row in Keyword.

This equates to a many-to-many relationship between that Article and Keyword tables.

The primary key for ArticleKeyword would be the composite key (article_id, keyword_id) unless you wanted to count multiple occurrences of the same keyword in the same article separately (perhaps noting the position in the article) in which case you may instead want a surrogate key like so:

Article                ArticleKeyword         Keyword
---------------        ---------------        ------------
                       ak_id      (PK)        
article_id (PK)  <---  article_id (FK)        
                       keyword_id (FK)  --->  keyword_id (PK)
                       kw_location

(or you could add the location properties to the combined key, of course, if those extra columns would never be NULL)