Oracle – Advice on Database Normalization

database-designdenormalizationoracle

I am working on a basic blood bank database on Oracle. I would like to create a completely basic normalized table and then I will add more details to it. I have done it to the 2NF and I will further proceed on to the 3NF afterwards, but first can someone take a look at my normalization and help me improve it (if needed) or any advices regarding it.

my unnormalized and 1st nf normalized tables

2nd nf tables

Best Answer

Excellent points by @joishi-bodio.

Here's a simple diagram explaining the different entities, attributes, and their relationships in your model.

The first step in modeling any database is to identify the real-world entities; the attributes that define each entity; and then their relationships and cross references.

Here are different observations and conclusions that you may draw from the example -

Entities and Attributes -

The most obvious real-world entity looking at your example is PERSON.
The entity Person can be defined with attributes like Name, Age, Gender, DOB, Blood group type
A PERSON can be either a 'Blood Donor' or 'Blood Receiver' or both. 'Donor' and 'Receiver' sound like the different roles that a Person takes, but are not different entities by themselves.
Blood Group can be another entity and can be defined by the name of that blood group among other things.
In a hospital, blood is stored in blood bags, which could be another entity, defined by who the donor for that blood bag is.
Transactions is another table which is the store for all blood transactions. A transaction can be defined by a blood bag being assigned to a person (receiver) on a date.

Relationships:

Every person has exactly 1 blood group. You can have 0 or many people with that group.
Each bloodbag has blood from 1 person. A person may or may not have blood in many blood bags.
You can assign 0 or many blood bags to a receiver. A person may or may not be a donor to one or many blood bags.

We get something similar to -

First Normal Form (1NF)

You cannot say a table is or is not in 1NF because no two rows contain repeating information. By repeating information you may be thinking that BOOK doesn't look like this:

BOOK { BookTitle, AuthorName1, AuthorName2, AuthorName3, BookType, ListPrice, AuthorAffil, BookPublisher  }

This really isn't the definition of 1NF. Instead, BOOK is in 1NF only if each column in every row will have a single value of whatever type, no matter how arbitrarily complex, that is defined for that column. While adding multiple AuthorNames columns may be bad design, it does not violate 1NF as each contains the name of a single author. Now what would violate 1NF is if we had an AuthorNames column and defined it to contain a comma separated list of author names. In that case we know absolutely BOOK is not in 1NF.

Its also possible we have designed the table variable to be in 1NF, assuming only a single author name would ever be placed into the AuthorName column, but our users start putting in comma separated lists of authors. Now the resulting table is not in 1NF even though we meant it to be. In both these cases the table isn't even a relational table anymore because it no longer follows the discipline necessary to gain the properties of a relation.

Secondly, you cannot say BOOK is or is not in 1NF without the presence of at least one candidate key! As stated right now, BOOK has no candidate keys and thus the same value is allowed to be entered for every column in two or more rows. This would result in duplicate rows and a table with duplicate rows is not a relational table!

Second Normal Form (2NF)

It appears from looking at the attempt to decompose BOOK a candidate key of BookTitle,AuthorName was assumed. Even if this assumption were correct, it is not correct to say the original table is now in 2NF. Instead, of the 3 tables resulting from the decomposition, each must now be evaluated with respect to the functional dependencies. R2 is in 5NF as its single non key column is fully functional dependent on the key and that key is the only candidate key and it is a single column. R3 is in 5NF as its only non key column (the assumed "extras") are also fully functionally dependent on the single candidate key and there are no additional candidate keys. It is only R1 that is now in 2NF but not 3NF as it has a non key column, ListPrice, that is functionally dependent on another non key column, BookType, and thus forms a transitive dependency.

The assumption of a single candidate key of BookTitle,AuthorName may also not be correct! Perhaps in this particular book world every book of interest has only one author. If that were the case then the creation of R3 is incorrect with respect to the actual functional dependencies. Thus it is vital to clarify all the candidate keys and all the functional dependencies, join dependencies, and multi-value dependencies, with the subject matter experts and to never assume them. Otherwise the wrong design results. Assuming many authors per book when really there can be only one per book results in an R3 that allows the FD of BookTitle --> AuthorName to be violated. Assumming one author per book when really there can be many per book results in a table similar to the original BOOK where users find they only have room for one author name, not the many they need, and they find their only choice is to enter sets of author names, delimited by something like commas or pipes.

References

Fabian Pascal's Practical Database Foundation Series and CJ Date's Relational Theory for Computer Professionals are excellent references for understanding what makes a table a relational table and the fundamentals of normalization. It is from these sources that all the information in this answer was derived.

Best Answer

Related Solutions

Approach to Analyzing Database Requirements

Database Normalization

First Normal Form (1NF)

Second Normal Form (2NF)

References

Related Question