Say a news website has news categories such as Politics, Business, Society, Education, Health, Tech, Celebrity, Lifestyle, … etc. The number of categories may be more than 20. New categories may be added; existing categories may be deleted.
If a registered member has no desire to see the news of a certain categories, he/she should be able to block those news so that those news don't show up in the news list on the homepage. This will only affect himself/herself, of course. Other members have their own preferences.
Each member can visit the Manage News Categories page to block/unblock news categories. The page will contain a HTML <form>
with <input type="checkbox">
s of news categories.
What's the most performant database design for such a feature?
I'm currently not an expert of database design. My initial thought is creating an associative table called "member_category_map" and associate it with the "members" table and the "categories" table.
Not sure if I'm being naive, but I'm worrying that the "member_category_map" table might end up containing too much data. Assuming that there are 20 categories and each member only blocks 1 category. If there are 2,000,000 members, then the "member_category_map" would contain 2,000,000 * 19 = 38,000,000 data. I don't know if that's good for performance.
Advices are much appreciated. Thanks.
Best Answer
You need four tables:
Table
Pivot
is standing for M:N relation and contains the subscription pairs user:category. Each user can have 0..n categories subscribed and each category can be subscribed by 0..m users. To fetch articles accordingly to the user's subscriptions you need to useJOIN
s like that:For the best performance you need to index tables by the columns used by the query:
user
should have the primary Key (id)article
should have the multicolumn key (id, cat_id, date)pivot
should have the multicolumn primary key (user_id, cat_id) with FKeys to the corresponding tables.