What exactly is overlapping candidate key

database-designdependenciesnormalization

Can some one please explain to me in simple terms what is overlapping candidate key ? What is overlapping as the name suggests?

Consider the below relation

R(L,M,N,O,P) 
{
    M -> O
    NO -> P
    P -> L
    L -> MN
}

Which of the above functional dependency is bringing in overlapping candidate key in the above relation ?

Lets restrict the discussion to dependencies, I'm not interested in BCNF at the moment

Best Answer

You're right on the money with the possible candidate keys, vikkyhacks. Overlapping candidate keys are composite (consist of more than one attribute) candidate keys with at least one attribute in common. So your overlapping candidate keys are NM and NO (they share N).

Additional explanation of the above, originally left in comments:

All overlapping candidate keys are groups of (e.g. two or more) candidate keys. That means the first criterion is that your relation R must have more than one candidate key (minimal super keys). For any of the candidate keys to be overlapping, each of them (again two or more) must meet a few additional conditions. 1) They both must be composite candidate keys. They must consist of more than one attribute, so a key like A will never be overlapping, but AB might overlap with another key. 2) The composite keys must share an attribute. AB overlaps with AC and BD but not CD or EF.

To summarize: two or more sets of attributes where 1) each set is a candidate key (minimal superkey) for the relation, 2) each set is a a composite key (consists of more than one attribute), and 3) one or more of the attributes of the composite keys overlaps with an attribute of another key in the set. So you can rule out MNOP and NOPL on the basis that they are not minimal superkeys. You can rule out P and L on the basis that they are not composite keys (they consist of one attribute). You're left with two keys, NO and NM, which share the attribute N, so you are finished.

Example

It may also help to have an example you can really wrap your head around. The only time I've ever seen where you will have overlapping candidate keys is when you have 1) two attributes that functionally determine one another (e.g. a one-to-one relationship between A and B where A has one B and B has one A) and 2) these attributes are part of composite candidate keys.

For example, in some system, a Customer has one CreditCard, and a CreditCard belongs to one Customer. In the Rentals table, you uniquely identify a Rental by the EquipmentId, Date and CustomerId. For convenience, you've stored CreditCard on this table as well.

This means that the following FDs hold:

{CustomerId, EquipmentId, Date} -> {CreditCard}
{CustomerId} -> {CreditCard}

But since the association is one-to-one, the following FDs also hold:

{CreditCard} -> {CustomerId}
{CreditCard, EquipmentId, Date} -> {CustomerId}

Since CustomerId and CreditCard can be used interchangeably to uniquely identify your customer.

In the scenario above, you have overlapping candidate keys:

{CreditCard, EquipmentId, Date}
{CustomerId, EquipmentId, Date}

They are overlapping because they are composite keys (they consist of more than one attribute) and because at least one of their attributes is shared (in this case, they share both EquipmentId and Date.

You said you don't care about BCNF at the moment, but to fully bring this home, the scenario above is the reason why you'll occasionally see a table that is in 3NF but not BCNF. The table above is in 3NF, but not BCNF.

3NF allows FDs where 1) the FD is trivial 2) the left side of the FD is a candidate key or 3) the right side of the FD is a key attribute (an attribute used to make any key). Since CreditCard and CustomerId are both key attributes, all FDs either meet 2 or 3.

BCNF is very similar, but it only allows conditions 1 and 2 allowed by 3NF. Since the 3rd condition is not allowed by BCNF, and both CID -> CC and CC -> CID use condition 3, this table is not BCNF, but it is 3NF.

For practical purposes, the case is fairly rare and this information is pedantic. The giveaway that your table has a problem will be the fact that CreditCard/CustomerId pairs are repeated across your table. You also may recognize that the table would not even be in 2NF without this rare condition where the right side of an FD may be a key attribute because CreditCard is a partial dependency on the primary key (it depends on CustomerId but not EquipmentId or Date.

Related Solutions

Why is this relation in 3NF

The mistake is in your understanding of transitive dependency. From wikipedia: Transitive dependency

In mathematics, a transitive dependency is a functional dependency which holds by virtue of transitivity. A transitive dependency can occur only in a relation that has three or more attributes. Let A, B, and C designate three distinct attributes (or distinct collections of attributes) in the relation. Suppose all three of the following conditions hold:

1. A → B
2. It is not the case that B → A
3. B → C

Then the functional dependency A → C (which follows from 1 and 3 by the axiom of transitivity) is a transitive dependency.

In your case though, (2) does not hold:

1. U → T                                  -- correct
2. It is not the case that T → U          -- wrong
3. T → V                                  -- correct

Therefore {V} is NOT transitively dependent on {U}.

What normal forms are these two relations in

The relation is actually in 3NF (so that it is also in 1NF and 2NF). The reason is that each attribute of the relation is prime, that is, it belongs to a (candidate) key (the are four keys in this relation: (A X), (A Z), (X Y), (Y Z)).

The definition of the 2NF (which has only an historical interest), is the following (Database System Concepts, 6th edition, Korth, Silberschatz, Sudharshan):

A functional dependency X → Y is called a partial dependency if there is a proper subset Z of X such that Z → Y . We say that Y is partially dependent on X. A relation schema R is in second normal form (2NF) if each attribute A in R meets one of the following criteria:

• It appears in a candidate key.

• It is not partially dependent on a candidate key.

so the relation respects this definition, since each attribute appear in a key.

While the definition of Third Normal Form is (same source as above):

A relation schema R is in third normal form with respect to a set F of functional dependencies if, for all functional dependencies in F⁺ of the form X → Y, where X ⊆ R and Y ⊆ R, at least one of the following holds:

• X → Y is a trivial functional dependency.

• X is a superkey for R.

• Each attribute A in Y − X is contained in a candidate key for R.

so the relations respects this definition too.

For your second question, the relation has two keys, A and Y, and it is already in BCNF as well as in 3NF.

Note that the analysis algorithm to decompose a relation in BCNF should return the original relation, since no dependency violates its definition (each determinant is a superkey).

Best Answer

Example

Related Solutions

Why is this relation in 3NF

What normal forms are these two relations in

Related Question