My website has different content types. All content types should be tag-able (except if specified in the model, whatever). I can think of 2 options for my design:
2 tables option : 1 table for the tags (+one other, its i18n content) plus 1 table for the association (content_type , content_id, tag_id).
Pros :
- Creation is faster
- model parts for the tag is the same (so inherited model)
- same thing for controlers.
Potential Cons :
- Will the performance be poor?
- will the table be too large soon ?
- How to query for tags clouds (related to all grouped contents)?
many tables options :
1 table for tags, an association table per type of content.
Pros :
- Tables will be shorter
Cons :
- Have to customize each model or write a logic/footprint to query the right table
- ugly long request / php code to join tags of all content types
Additional Facts :
- Front side will have cache, tags are set by content writers.
- There are 10 content types.
- There won't often be new types of content.
- Content count is ~500 , will ~double in 3 years.
- MySQL engine InnoDB.
- built with yii framework.
- will query tags for a defined content type
- will query tags for any type of content (together, one big tag cloud for example).
So my idea was to use the first design (2 tables). But I'm not a db expert …
Would you recommand the second option ? Why ?
Best Answer
I would go for option 1 (two tables).
If you only plan to have 1000 content items (and in 3 years, not now), I don't think you will have many performance problems unless each content item is tagged with a dozen tags where most are unique to that item. And even then I don't think you'll see much of a performance problems. It also simplifies your structure if you have 1 tag table for every content type and then you decide to add/remove/change content types later.
If you're really not sure, do you have the time/resources to build two sample DBs, one based on each structure, populate them with what you think is a realistic load (you can write a script/program to generate & insert test data, right? with "boring" tags and content such as "content_1..content_2000 and tag_1... tag_5000) and actually test which performs better?