I'm working on an SQL database of useful plants. I'd like to be able to allow users to annotate the plants with a set of hierarchical categories and tags. For example:
- category: medicine
- body part: stomach
- disease: IBS
An initial database structure could be:
Plant
id | name | category |
---|---|---|
1 | ginger | medicine |
PlantTags
id | plant_id | tag_id |
---|---|---|
1 | 1 | 1 |
2 | 1 | 2 |
Tags
id | type | value |
---|---|---|
1 | body part | stomach |
2 | disease | IBS |
I would like to restrict the categories and tags that can be used together. For example, one should only be able to apply the body part
and disease
tags if the medicine
category is selected. Furthermore, one should only be able to select the disease: IBS
tag if the body part: stomach
tag is selected.
Put another way, I want a way to specify which combinations of tags are valid.
Is this possible at the database level?
EDIT:
Users should also be able to categorise a plant without tagging it. For example:
Plant
id | name | category |
---|---|---|
2 | cedar | timber |
EDIT 2:
Here is an example of a set of tags that should not be allowed:
id | type | value |
---|---|---|
1 | body part | stomach |
2 | disease | pneumonia |
pneumonia
is a disease of the lungs, not the stomach, so having the two tags stomach
and pneumonia
should not be allowed.
Best Answer
There are a couple of ways you can go about this, but the most simplest way to enforce this would probably be to create a dimension combination table. Such a table is pre-filled with all valid possible combinations (by
id
) and then that is the source of your application's controls (such as dropdowns, etc) because as you filter on one dimension in that table, only valid entries for the remaining dimensions will be available.For example, your dimension combinations table would have a
plant_category_id
column and atag_type_id
and maybe a secondtag_type_id2
column (or if you normalized yourTags
table, the column naming would be a little better in the dimensions combinations table).The other thing you can do is implement a complex check constraint (possibly with a function) but this is probably more work than it's worth, and will depend on which implementation of MySQL and version you're using.
The third way is handling the logic in stored procedures that manage data access via the logic you want to implement, but probably would also be a lot of work, especially to maintain over time. So the first suggestion that makes it table driven would be my first pick.