What exactly is overlapping candidate key

database-designdependenciesnormalization

Can some one please explain to me in simple terms what is overlapping candidate key ? What is overlapping as the name suggests?

Consider the below relation

R(L,M,N,O,P) 
{
    M -> O
    NO -> P
    P -> L
    L -> MN
}

Which of the above functional dependency is bringing in overlapping candidate key in the above relation ?

Lets restrict the discussion to dependencies, I'm not interested in BCNF at the moment

Best Answer

You're right on the money with the possible candidate keys, vikkyhacks. Overlapping candidate keys are composite (consist of more than one attribute) candidate keys with at least one attribute in common. So your overlapping candidate keys are NM and NO (they share N).

Additional explanation of the above, originally left in comments:

All overlapping candidate keys are groups of (e.g. two or more) candidate keys. That means the first criterion is that your relation R must have more than one candidate key (minimal super keys). For any of the candidate keys to be overlapping, each of them (again two or more) must meet a few additional conditions. 1) They both must be composite candidate keys. They must consist of more than one attribute, so a key like A will never be overlapping, but AB might overlap with another key. 2) The composite keys must share an attribute. AB overlaps with AC and BD but not CD or EF.

To summarize: two or more sets of attributes where 1) each set is a candidate key (minimal superkey) for the relation, 2) each set is a a composite key (consists of more than one attribute), and 3) one or more of the attributes of the composite keys overlaps with an attribute of another key in the set. So you can rule out MNOP and NOPL on the basis that they are not minimal superkeys. You can rule out P and L on the basis that they are not composite keys (they consist of one attribute). You're left with two keys, NO and NM, which share the attribute N, so you are finished.

Example

It may also help to have an example you can really wrap your head around. The only time I've ever seen where you will have overlapping candidate keys is when you have 1) two attributes that functionally determine one another (e.g. a one-to-one relationship between A and B where A has one B and B has one A) and 2) these attributes are part of composite candidate keys.

For example, in some system, a Customer has one CreditCard, and a CreditCard belongs to one Customer. In the Rentals table, you uniquely identify a Rental by the EquipmentId, Date and CustomerId. For convenience, you've stored CreditCard on this table as well.

This means that the following FDs hold:

{CustomerId, EquipmentId, Date} -> {CreditCard}
{CustomerId} -> {CreditCard}

But since the association is one-to-one, the following FDs also hold:

{CreditCard} -> {CustomerId}
{CreditCard, EquipmentId, Date} -> {CustomerId}

Since CustomerId and CreditCard can be used interchangeably to uniquely identify your customer.

In the scenario above, you have overlapping candidate keys:

{CreditCard, EquipmentId, Date}
{CustomerId, EquipmentId, Date}

They are overlapping because they are composite keys (they consist of more than one attribute) and because at least one of their attributes is shared (in this case, they share both EquipmentId and Date.

You said you don't care about BCNF at the moment, but to fully bring this home, the scenario above is the reason why you'll occasionally see a table that is in 3NF but not BCNF. The table above is in 3NF, but not BCNF.

3NF allows FDs where 1) the FD is trivial 2) the left side of the FD is a candidate key or 3) the right side of the FD is a key attribute (an attribute used to make any key). Since CreditCard and CustomerId are both key attributes, all FDs either meet 2 or 3.

BCNF is very similar, but it only allows conditions 1 and 2 allowed by 3NF. Since the 3rd condition is not allowed by BCNF, and both CID -> CC and CC -> CID use condition 3, this table is not BCNF, but it is 3NF.

For practical purposes, the case is fairly rare and this information is pedantic. The giveaway that your table has a problem will be the fact that CreditCard/CustomerId pairs are repeated across your table. You also may recognize that the table would not even be in 2NF without this rare condition where the right side of an FD may be a key attribute because CreditCard is a partial dependency on the primary key (it depends on CustomerId but not EquipmentId or Date.