BCNF – Understanding Why R1 is Not in BCNF

database-designnormalizationrelational-theory

Given that R(ABCDE)

AB->C

BC->D

CD-> E

DE->A

Since, B is the only attribute absent on the right-hand side, we would start with B.

{B}+ = B . Not a CK.

Next, we combine different attribute with B.

{AB}+ = AB = ABC = ABCD = ABCDE. This is a CK.

{BC}+ = BC = BCD = BCDE = BCDEA. This is a CK.

{BD}+ = BD . Not a CK.

{BE}+ = BE . Not a CK.

{BDE}+ = BDE = BDEA = BDEAC. This is a CK.

So, total number of CK/minimal keys are 03 (three).

Since, in some FDs, attributes are not dependent on keys, R is not in BCNF.


(1) Finding the violating FDs:

CD -> E and DE -> A are the violations of BCNF. Coz, in these FDs, attributes are nor dependent on keys.

(3) Decomposition:

Start with any one of the invalid FDs:CD -> E .

(i) Find the closure of CD and decompose:

R1 = {CD}+ = CD = CDE = CDEA = ACDE

R2 = (R - R1) + {CD} = (ABCDE - ACDE) + CD = BCD

Now, we will have to check to see whether both R1 and R2 in BCNF.

R1(ACDE), R2(BCD)

AB->C

BC->D

CD-> E

DE->A

I have a question, how did we know that R1 was not in BCNF and needs to be decomposed?

Since, CD->E is already applied in a decomposition, we don't need to use that again. Coz, it will produce the same result.

(1) Finding the violating FDs:

DE->A is violating the BCNF and it is also not applied yet.

(2) Decomposision:

Start with DE->A.

(i) Find the closure of DE and decompose:

R3 = {DE}+ = DEA = ADE.

R4 = (R1 - R3) + {DE} = (ACDE - ADE) + {DE} = CDE.

So, our final BCNF decompositions are,

R2{BCD}, 
R3{ADE},
R4{CDE}.

Best Answer

You are asking:

I have a question, how did we know that R1 was not in BCNF and needs to be decomposed?

In the BCNF Decomposition Algorithm, when a relation is decomposed, one should find the dependencies of the subschemas, in this case R1(ACDE) and R2(BCD).

Let’s start from R1. To find the dependencies that hold in R1, one should actually project the original dependencies over the subschema, but, for simplicity, we would consider only those dependencies in which all the attributes are present in the subschema.

So, in R1 the dependencies {C D → E, D E → A} hold, C D is the key, and the algorithm continues by searching if the dependency D E → A violates the BCNF (of course the first one does not violates it since C D is the key of R1).

And it is easy to see that D E → A violates the BCNF since D E is not a superkey of the relation, so that now R1 must be decomposed.

So, the answer to your question is:

The algorithm knows that R1 is not in BCNF simply by checking if the dependencies that hold in R1 have a determinant which is not a superkey of R1.


From comments:

After the 1st decomposition, we should calculate the minimal keys again. Am I correct?

We know the key of R1 because of the way in which it is built (as the closure of the determinant of the original dependency CD -> E). We must find the key of R2, instead.

Which part of my answer needs to be corrected?

Simply cancel the sentence "Since, CD->E is already applied in a decomposition, we don't need to use that again. It will produce the same result.", and continue with "(1) Finding the violating FDs: DE->A is violating the BCNF and it is also not applied yet..."