Normalization – How to Decompose Relation into 3NF Relations

normalizationrelational-theory

Today I read about the 3NF decomposition algorithm. It said:

  1. Find a minimal basis of F, say G
  2. For each FD X → A in G, use {X, A} as the schema of one of the
    relations in the decomposition
  3. If none of the sets of relations from Step2 is a superkey for R, add
    another relation whose schema is a key for R

I want to decompose this relation into 3NF.

R(A,B,C)
S={A→B, A→C, B→A, B→C, C→A, C→ B, AB→ C, BC→A, AC→B, A→BC, B→AC, C→AB}

As we can see, the key of R is: {A},{B},{C}

S has several minimal basis, such as:

  1. {A→B, B→A, B→C, C→B}; and
  2. {A→B, B→C, C→A}

The problem is, if we use the 1st minimal basis, then we decompose R into 2 relations: (A, B), (B, C).

If we use the 2nd minimal basis, R turns into: (A, B), (B, C), (C, A).

My question is: which one is correct?

Best Answer

First of all, note that the original relation is already in Third Normal Form, since each attribute is prime (each attribute is a key, actually), so that the definition of 3NF is respected.

Then, note that the algorithm is incomplete. The steps are:

  1. Find a minimal basis of F, say G
  2. For each groups of FD with the same left part, X → A1, X → A2, ..., X → An in G, use {X, A1, A2, ..., An} as the schema of one of the relations in the decomposition
  3. Delete all the relations whose attributes are contained in another relation.
  4. If none of the sets of relations from Step2 is a superkey for R, add another relation whose schema is a key for R.

So in the first case, you obtain three groups of dependencies:

A → B
B → A
B → C
C → B

that produce three relations, R1(A, B), R2(A, B, C), R3(B, C), and, following the algorithm, you obtain as result only R2, since the other two have attributes contained in them.

So you have two different outputs from the algorithm, depending on the minimal base used (which in turn depends on the order in which you consider the dependencies when calculating the minimal cover).

So, the answer to your question:

which one is correct?

is: both of them are correct, since both of them satisfies the definition of the 3NF. You have simply discovered that the synthesis algorithm for decomposing a relation in 3NF can produce different solutions.

A different question is: which is “better”, and of course the solution with a single relation is “better”, since you do not need to join tables when making queries.

Of course, if one could check at the beginning if the relation is already in 3NF, than he can avoid to apply the algorithm. But this in general cannot be done, since the check requires the exponential computation of all the keys, to find the prime attributes of the relation.