Normalization with Two Potential Primary Keys – Best Practices

database-designdependenciesnormalizationrelational-theory

I have the following STUDENT relation in first normal form, where id has been identified as the primary key (email here is also unique for all tuples):

STUDENT(id, email, first_name, last_name)

I want to normalize this relation/table such that it is in third normal form (3NF).

The dependencies that I've been able to identify from the above STUDENT relation/table are the following

Full:

{ id } → { email }

Transitive:

{ email } → { first_name, last_name }

Here's the part where I'm confused. I've identified email as a transitive dependency as it can be identified by the primary key (id), and the attribute/column email itself can identify and determine the first_name and last_name.

…if I was to continue with this idea, I would eventually get to a 3NF which looks like so:

STUDENT(id, email)
STUDENT_INFO(email, first_name, last_name)

However, the STUDENT relation/table here seems very redundant, as all it does is point to another relation/table. Is my normalization process here correct? Or did I get my dependencies wrong? It seems that I will always run into this problem whenever the initial relation/table has two candidate keys which both could either be the primary key (id or email both could uniquely identify each row).

Best Answer

Here is where I went wrong:

The important thing about transitive dependencies are these two key factors:

  1. If B depends on A, and C depends on B, then C also depends on A (A → B → C)
  2. B is not (or part of) a candidate key

In my example, I identified the following as a transitive dependency:

{ email } → { first_name, last_name }

However, this is not the case, as email is a candidate key, and so instead, if we use id as the primary key, we have a full dependency:

{ id } → { email, first_name, last_name }