First Normal Form, why is it good and how does it reduce redundancy

normalizationrelational-theory

I asked a similar question on SO and was advised to ask the type of question here. This is for a course on relational databases. A sample problem asks "how come every table in a relational database should be in First Normal Form" my first problem with this is isn't 3NF considered better so wouldn't the statement be false? My second problem with this is on SO people said that First Normal Form is good because it eliminates redundancy. I don't see how it does this because 1)all values are atomic 2)there is a primary key – neither of which eliminates redundancy.

I guess another way to ask it is why is it important for values to be atomic?

Best Answer

The primary importance of first normal form is not that it eliminates redundancy, but rather, it's that it eliminates repeating groups.

Instead of having multiple columns of the same kind of data in a record, (0NF) you remove the repeated information into a separate relation and represent them as rows. This is what constitutes 1NF.

Tables that have columns like: phone_1, phone_2, phone_3 or that contain list-oriented data like: 212-555-1212, 212-555-1234, 416-967-1111 violate 1NF.

1NF is important because it is much more flexible than 0NF while being much easier to use when inserting, updating and reading data. This is because every type of data element (e.g. customer phone number) has exactly one column in which to find it and that column has only one piece of data for each record. This means that you can use simple SQL statements to read or write individual data elements without having to parse delimited strings or use constructions like: where phone_1=@Number or phone_2=@Number or phone_3=@Number and so forth.

Regarding 1NF vs 3NF, the normal forms are cumulative. A table in 3NF is also in 1NF, so it is just as true to say that "Every table in a relational database should be in 1NF" as it is to say that "Every table in a relational database should be in 3NF." I would say both of these are true, but I would add "unless you have a really good, well considered reason to denormalize".