Database Design – Help with Normalization

database-designnormalization

I'm having difficulty understanding how to go from 1NF to 3NF. All the tutorials I've looked at are very technical and I just need someone to tell me in plain English how to do it.

I have this table in 1NF (I think) and need to do functional dependency diagram for it. I don't understand how to join the group fields/attributes with arrows i.e. I don't understand functional, partial and transitive dependencies. Help please.

Here is my table:1NF

And here is my (unfinished) dependency diagram:

functional dependency diagram

Best Answer

As the old saying goes: Technically correct is the best kind of correct. So following the technical tutorials is something you should do. However, I remember when normalization was first taught to me it didn't click in right away. Let's consider the basic steps in plain, but less accurate, English:

First Normal Form (1NF) says to remove columns containing repeating groups to its own table. Your relation doesn't have any repeating groups in any of the columns, so your relation is already in 1NF.

Second Normal Form (2NF) says that there shouldn't be any non-key attributes which depend on a subset of a multi-part candidate key. You have one multi-part key. That appears to be the details of the appointment, which seem to depend on the patient, the doctor and the date. That means that you need to move the appointment details out to a new relation (along with foreign keys that make a candidate key for the Appointment Details relation. When talking about 2NF, sometimes you'll hear the term "partial functional dependency". You want to avoid those.

Third Normal Form (3NF) says that every attribute in a relation depends only on every candidate key (and nothing else). This is also know as a "transitive functional dependency" - also to be avoided.

If you look at your two relations that you have after achieving 2NF, you have these candidate keys and the following fields which rely on those keys:

  • Patient ID determines: Patient Name, Sex and DOB.
  • Doctor determines itself (there are no non-key attributes of doctor)
  • Patient ID + Appointment Date + Doctor + Details determine the details of an appointment.

NOTE: One other obvious thing that stands out from your sample data is that appointment details is probably actually a compound attribute. It looks like the details are a number and a textual description, jammed together with " - " between them. You could make a case, therefore, that the textual description is determined by the leading number. If that is true of the actual data, then you'd actually want to split out a two-column relation: (e.g.: Treatment ID, Treatment Description) as a separate relation. This would be part of getting to 3NF as well.

If you did split out a treatment relation, then the appointment relation would only have the Treatment ID, not the description in it.

I hope this helps to understand the steps and how to apply them to your example. Stick with the technical tutorials. You'll figure them out soon enough and then you'll start to get an instinctual feel for defining relations in 3NF right from the start. Once you get comfortable with it, you won't usually need to go through a formal normalization exercise. It will just come naturally.