I have been studying the normal forms rules to guide me with database design and data storage, the problem I am having is that once I optimize for the first normal form (1NF), I figured I may well just go the whole way(even though not mandatory).
However a occurrence showed in my design that I haven't came across in examples online or books.
A example concept.
We want to store all data about the some car variations.
We must store values make, series, model, start year, end year.
Unormalized
+------+--------+-------+------------+----------+
| make | series | model | start year | end year |
+------+--------+-------+------------+----------+
| BMW | 5 | E12 | 1972 | 1981 |
| BMW | 5 | E28 | 1981 | 1988 |
| BMW | 5 | E34 | 1988 | 1996 |
| BMW | 5 | E39 | 1995 | 2004 |
+------+--------+-------+------------+----------+
Normalized
car_make
+----+------+
| id | make |
+----+------+
| 1 | BMW |
+----+------+
car_series
+----+--------+---------+
| id | series | make_id |
+----+--------+---------+
| 1 | 1 | 1 |
| 2 | 3 | 1 |
| 3 | 5 | 1 |
| 4 | 7 | 1 |
+----+--------+---------+
car_model
+----+-------+------------+----------+-------- --+
| id | model | start year | end year | series_id |
+----+-------+------------+----------+-----------+
| 1 | E12 | 1972 | 1981 | 3 |
| 2 | E28 | 1981 | 1988 | 3 |
| 3 | E34 | 1988 | 1996 | 3 |
| 4 | E39 | 1995 | 2004 | 3 |
+----+-------+------------+----------+-----------+
One problem arises here in the last table, should I also include a column make_id in the car_model table?
It is beneficial as I would not have to join car_series, then select make_id, to select make by id, however I believe this may not be sticking to normalization, because I would be repeating data(even though it is a id its still repeat).
Can this design be normalized further?
On another note with regards to the start year, end year handling, How would you professionals do this? I am curios if there is any range tricks to enforce integrity(although the E39 overlaps, lets speak as though no dates overlap).
Thanks in advance for any thoughts.
Best Answer
No, you should not include a column make_id in the car_model table, it is implicitly defined by the series_id. If you need to see the make details you could make a view that looks like your un-normalized table.
No this design cannot be normalised further.
To force the year ranges to be non-overlapping you could add a trigger (as MySql does not support check constraints) to ensure that
start_year >= max(end_year)
andend_year >= start_year
.The trigger would look something like this: