Database Design – Decomposition of a Relation to 2NF and 3NF

database-designdatabase-theorynormalization

I am trying to solve this example, but for now I can't. Can somebody show me, step by step how to decompose this to 2NF and then to 3NF?

R = {ABCDEF, ABC -> EF, B -> E, E -> D}

key of this relation is:

{ABC}

So now I try to split R into 2 relations:

R0 = {ABCEF, ABC -> EF}

R1 = {BED, B -> E, E -> D}

but its all wrong… i got other answers in my book. I dont even try to make 3NF out of it.

Best Answer

The goal of 3NF (and BCNF) is to avoid situations when a set of attributes is functionally determined by something that is not a key (or, more formally, a superkey), which may lead to anomalies and redundancy.

Keeping that in mind, we can describe certain functional dependencies as either "bad" (if they prevent the relation from being in 3NF) or "good" (if they do not).

So your solution is wrong, because "your" R1 has the E->D dependency and E is not R1's key. Determining a key for a relation is another topic which, I suppose, is beyond the scope of the task you need to accomplish (we are told that ABC is a key). But if you are interested, you can look the method up using "attribute closure" keywords.

Anyway, the E->D dependency is "bad". But that is not the only problem. Another one is that your decomposition is not a "lossless-join" one. In other words, if we have some real world data, split the relation according to your proposition and try to join R0 and R1 back, the data will be corrupted. Data corruption is not what normalization is designed for.

We have to solve 2 problems:

  1. Decide which dependencies are "bad" (the definition of normal forms is the answer to this)
  2. Decide how to split relations to prevent data corruption (we have Heath's theorem to help us)

Getting back to your problem, finally.

In R, neither B, nor E is the key. So both B->E and E->D are "bad", and we need to take measures. On the contrary ABC->EF is "good", because ABC is a key.

Currently R is in 1NF, because B is a part of a key and functionally determines E, which is not a part of a key. This kind of situation is unacceptable in 2NF.

E->D dependency is not that "bad", because E is not a part of a key.

Another thing to mention at this point is that a set of dependencies {B->E, E->D} implies the dependency B->ED (according to so called Armstrong axiom of transitivity).

So we decide to split using B->ED dependency as "the worst" one. That's the way we do this, according to Heath:

R = ABCDEF
R0 = BED (B->E, E->D, B is a key)
R1 = ABCF (ABC->F, ABC is a key)

Now we have R1, which is in 3NF and R0, which is in 2NF (it has a transitive dependency B->E->D, which is unacceptable in 3NF).

We need to split R0 and to do so we have to choose a "victim" dependency among B->E and E->D. As we see, B->E is now a "good" one (B is a key) and E->D is still "bad".

Our split:

R0 = BED
R2 = BE (B->E, B is a key)
R3 = ED (E->D, E is a key)

R1, R2 and R3 are the final decomposition.