Second Normal Form (2NF) Normalization Exercise

database-designnormalization

I'm using a past exam paper for practice, can someone help me normalise this database to 2NF? I have uploaded a picture of the original below, and also my attempt at normalising it myself. I am certain I have gone wrong with the YEAR column, but can't be sure.

Original

Attempt

Best Answer

You have the right idea and have just gone a little wrong with the year column.

Background

The table given shows that Company Id + Year is the key to the table - meaning that each row must have a unique combination of the value of these two columns. This is indicated by the solid line under these two columns. A table is normalized when that table is a faithful representation of a relation. To approximate a relation, the table must:

  1. Have unique unordered rows
  2. Have uniquely labeled unordered columns
  3. Have a single value of whatever domain the column is defined upon in cell

In looking at the table in the exercise these conditions are met. Once a table is normalized, it can be further normalized as a way to eliminate certain redundancies which occur due to functional dependencies between the columns. By functional dependency we simply mean that the value of one column always determines the value of the second. This is based on the mathematics of algebraic functions. For example, the value 2, when plugged into the algebraic function 2x + 3 will always yield 7. 2NF is defined to mean that each non-key column in the table is fully dependent on the entire key - thus no partial key functional dependencies.

The Exercise

When inspecting the example table we can easily see Company Nameis dependent only on Company Id and not on Year. Each time for example we see a value of 295 for Company Id we see a value of Miracle Holidays. The repair is exactly as you have done - split Company Id and Company Name out into their own table where there can be a single unique row for each Company Id with the corresponding Company Name. Now Company Name is fully dependent on the key - Company Id. Year simply stays in the original table and there is no need to redundantly split it out into a third table.

A Caution

The exercise is asking you to infer the functional dependency of Company Name on Company Id based solely on inspection of the data. While in this simple example it is obvious, in the real world you cannot simply assume a functional dependency exists based on the existing data. What appears to be a functional dependency might turn out to instead be coincidental. Because of this, in the real world you would always ask the expert in the domain of knowledge the table represents what the functional dependency should be.

Getting More Information

Normalization is a very complex topic and I have glossed over many important concepts. CJ Date has written an entire book with respect to it called Database Design and Relational Theory: Normal Forms and all that Jazz. While definitive, it is hard to grasp all the formalisms. An excellent reference that presents the formalisms in language more easily understood by common practitioners is Fabian Pascal's Practical Database Foundation Series. Studying both of these references - first Fabian's and then Date's - will give you all the information you need to master normalization as a repair procedure.