Normalization – How to Track Functional Dependencies in BCNF Decomposition

normalization

I have the following relation and functional dependency set:

R(A,B,C,D,E,G,H)
F = {AB -> CD, E -> D, ABC -> DE, E -> AB, D -> AG, ACD -> BE}

I am really confused about how to keep track of functional dependencies when decomposing R.

For example, if we decompose:

R1 = (ABCD)
R2 = (DEGH)

then (the easy ones):

F1 = {AB -> CD}
F2 = {E -> D}

So what I want to know is, what is the best STRATEGY of keeping track of the rest of the functional dependencies in F1, F2.

Is it enough to find the closure of all subsets of ABCD to complete F1? or should I find closures of all subsets of ABCDEGH before the decomposition occurs? I'm actually not quite sure how this would happen. for example:

Before the decomp:

D+ = DAG

After the decomp

D+ = D? 

Best Answer

Decomposition and Functional Dependencies

In general, when a relation R(T) with dependencies F is decomposed in two relations R1(T1), R2(T2), the dependencies holding in the two relations cannot be immediately derived from the original set of dependencies F. This is because there could be dependencies implied by F, that is in F+, that hold in the decomposed relations. If we want to know all the dependencies that hold in the decomposed relations, we should consider the projection of the dependencies F over the attributes of the decomposed relations.

Here is the formal definition of projection of a set of dependencies F: Given R, and TiT, the projection of F on Ti is equal to πTi(F) = {X → Y ∈ F+ | X,Y ⊆ Ti }. In other words, one should calculate F+, the closure of F, and then get all the dependencies that have all the attributes in Ti.

But, since the computation of F+ is an exponential task, the computation of a projection of a set of functional dependencies is an exponential task and it is not performed in practice (a part from simple cases with very few attributes and dependencies).

What instead can be decided with a polynomial algorithm is the answer to this question: does the decomposition R1(T1), R2(T2) of R(T) with dependencies F preserves the dependencies? (see J. Ullman, Principles of Database Systems, Computer Science Press, 1983).

By definition, a decomposition preserves the dependencies if the closure of the dependencies projected over the decomposed relations is equal to the closure of the original relation. Formally, a decomposition Ri(Ti) preserves the dependencies if and only if (∪πTi(F))+ = F+.

Why we are interested in FDs of decomposed relations?

When decomposing for a specific reason, for instance to normalize a relation, we are interested in knowing if the decomposition preserves the dependencies (otherwise it is not a good decomposition).

But, while in the synthesis algorithm for the 3NF we are guaranteed that the decomposition alway preserves the dependencies, the same is not true for the BCNF. On the contrary, there are examples of relations that actually cannot be decomposed in BCNF without losing the dependencies (for instance, R(A,B,C), F={AB → C, C → A}).

Summary

  1. When we decompose for BCNF, we should be prepared to lose some dependency (and, for this reason, in practice the 3NF is used).

  2. The analysis algorithm of BCNF requires an exponential task to be performed, since at each step one should check if a decomposed relation is still not in BCNF, and to know this one should know the projection of the dependencies over it (or at least its cover).

A BCNF for your example

In your example, a BCNF obtained with the classical analysis algorithm is the following (starting with a minimal cover of F and producing a decomposition with a cover of the projection of the dependencies):

R2 < (A B H) , { } >

R3 < (A D G) , { D → A, D → G } >

R4 < (B C D E) , { E → D, E → B, C D → B, B D → C, B D → E } >

Note that in this decomposition the two original dependencies:

{ A B → C D
  A B C → D E }

are lost.