Database Design – To Normalize or Not to Normalize

database-designnormalizationpostgresql

In general, I understand that normalization is usually beneficial even with the join costs. However, I came up with an interesting dilemma recently.

What if the data is duplicated but unlikely to ever change. It's possible, but I would not anticipate it.

I have a nutrients table with a unit column and the units would be g, kg, ug, etc.

I can't see these values every changing.

I'm tempted to just put them as a column in the table rather than normalizing and having a units table and using a foreign key and having to join whenever fetching a row from the nutrients table. At the same time, I know in general, even with the join coins, we should normalize.

What should I do (and why)?

Best Answer

There is a difference between redundant data (bad) and coincidentally repeated data (not bad). Normalization is a technique which is used to avoid insert, update and delete anomalies. It is not meant to eliminate every repetition of a piece of data. Data which is static doesn't benefit from normalization.

A unit of measure, stated as a standardized abbreviation is not the kind of data that you need to normalize out.

Think of it this way: If you were to normalize out your unit of measure into a separate table, you'd need a foreign key from your nutrients table to your units table. Is the unit of measure code going to be unique (probably, yes). Therefore it's a candidate key for your units table. If it's a candidate key in units you could use it as a foreign key in nutrients.

The end result is that you have your unit of measure code in your nutrients table anyway even if you've normalized out the units.

Here's when you would want to create a units table. If you have other predicates (columns) that are dependent on the unit code, but not on the nutrient. For example, a unit_type or a conversion factor to a base unit of the same type (grams for weight, etc.) This would be a transitive functional dependency in your nutrients table and doesn't belong there for that reason.