Mysql – Maximum normalization

MySQLnormalizationPHP

I have been studying the normal forms rules to guide me with database design and data storage, the problem I am having is that once I optimize for the first normal form (1NF), I figured I may well just go the whole way(even though not mandatory).

However a occurrence showed in my design that I haven't came across in examples online or books.

A example concept.

We want to store all data about the some car variations.

We must store values make, series, model, start year, end year.

Unormalized

+------+--------+-------+------------+----------+
| make | series | model | start year | end year |
+------+--------+-------+------------+----------+
| BMW  |   5    |  E12  |    1972    |   1981   |
| BMW  |   5    |  E28  |    1981    |   1988   |
| BMW  |   5    |  E34  |    1988    |   1996   |
| BMW  |   5    |  E39  |    1995    |   2004   |
+------+--------+-------+------------+----------+

Normalized

car_make

+----+------+
| id | make | 
+----+------+
| 1  | BMW  | 
+----+------+

car_series

+----+--------+---------+
| id | series | make_id |
+----+--------+---------+
| 1  |   1    |    1    |
| 2  |   3    |    1    |
| 3  |   5    |    1    |
| 4  |   7    |    1    |
+----+--------+---------+

car_model

+----+-------+------------+----------+-------- --+
| id | model | start year | end year | series_id |
+----+-------+------------+----------+-----------+
| 1  |  E12  |    1972    |   1981   |     3     |
| 2  |  E28  |    1981    |   1988   |     3     | 
| 3  |  E34  |    1988    |   1996   |     3     | 
| 4  |  E39  |    1995    |   2004   |     3     |  
+----+-------+------------+----------+-----------+

One problem arises here in the last table, should I also include a column make_id in the car_model table?

It is beneficial as I would not have to join car_series, then select make_id, to select make by id, however I believe this may not be sticking to normalization, because I would be repeating data(even though it is a id its still repeat).

Can this design be normalized further?

On another note with regards to the start year, end year handling, How would you professionals do this? I am curios if there is any range tricks to enforce integrity(although the E39 overlaps, lets speak as though no dates overlap).

Thanks in advance for any thoughts.

Best Answer

No, you should not include a column make_id in the car_model table, it is implicitly defined by the series_id. If you need to see the make details you could make a view that looks like your un-normalized table.

No this design cannot be normalised further.

To force the year ranges to be non-overlapping you could add a trigger (as MySql does not support check constraints) to ensure that start_year >= max(end_year) and end_year >= start_year.

The trigger would look something like this:

create trigger trg_car_model_date_range_unique before insert on car_model
for each row
begin
    if new.start_year < (select max(end_year) from car_model) then
        signal sqlstate '45000' set message_text = 'Date range is not unique';
    end if;
    if new.end_year < new.start_year then
        signal sqlstate '45000' set message_text = 'Date range is invalid';
    end if;
end
Related Question