Mysql – Creating a normalised Twitter database and foreign keys

database-designMySQL

Hello I am creating a normalized Twitter DB for a project I am working on and wanted a bit of guidance on foreign keys.

I am curious about what the difference would be if for hashtags, symbols, user_mentions, urls and media, I made another field in each table called tweet_ID and the relationship changes to the primary key id in the tweet table to each tables tweet_id? The way I've modelled it means the tweets table has 5 extra fields for each tweet object.

Currently I have the schema designed as follows:
Schema 1

Are there glaring issues with the way I have implemented the foreign keys in the model this way?

Thank you!

Best Answer

So, for each tweet, you can have no more than one hashtag? That doesn't sound right to me.

For tweets and hashtags (and probably for the other four connected tables), I would create a "bridge" table, tweet_hashtag. This table would consist of:

  • tweet_id (foreign key to the tweets table)
  • hashtag_id (foreign key to the hashtagstable)
  • if appropriate, other columns that would apply only to the bridge table, if any (I can't think of any in this case, as I assume a tweet is created at a single moment in time; if tweets could be edited, then a created_on date/time column could be appropriate)
  • (Note: the combination of tweet_id and hashtag_id should be the primary key for this table, as you presumably should only note a given hashtag appears in a given tweet once. You can create a surrogate key (probably just an auto-increment integer or bigint value) if you really want to, or if you do need to be able to have multiple entries of the same hashtag for a single tweet for some reason)

This allows a given hashtag to be associated with many tweets, and a given tweet to have multiple (or no) hashtags.

As noted above, the same probably applies to the other four fields associated with the tweets table (user_mentions, urls, media, symbols).