Hello I am creating a normalized Twitter DB for a project I am working on and wanted a bit of guidance on foreign keys.
I am curious about what the difference would be if for hashtags, symbols, user_mentions, urls and media, I made another field in each table called tweet_ID and the relationship changes to the primary key id in the tweet table to each tables tweet_id? The way I've modelled it means the tweets table has 5 extra fields for each tweet object.
Currently I have the schema designed as follows:
Are there glaring issues with the way I have implemented the foreign keys in the model this way?
Thank you!
Best Answer
So, for each tweet, you can have no more than one hashtag? That doesn't sound right to me.
For
tweets
andhashtags
(and probably for the other four connected tables), I would create a "bridge" table,tweet_hashtag
. This table would consist of:tweet_id
(foreign key to thetweets
table)hashtag_id
(foreign key to thehashtags
table)created_on
date/time column could be appropriate)tweet_id
andhashtag_id
should be the primary key for this table, as you presumably should only note a given hashtag appears in a given tweet once. You can create a surrogate key (probably just an auto-increment integer or bigint value) if you really want to, or if you do need to be able to have multiple entries of the same hashtag for a single tweet for some reason)This allows a given hashtag to be associated with many tweets, and a given tweet to have multiple (or no) hashtags.
As noted above, the same probably applies to the other four fields associated with the
tweets
table (user_mentions
,urls
,media
,symbols
).