Database Design – Is This Design in 3NF and Why Choose Another Design?

database-designnormalization

Having the following entities: Objects, Parts, Subparts and Items, and given the following design restrictions:

  • A Part belongs to one and only one Object.
  • A Subpart belongs to one and only one Part.
  • An Item belongs to one and only one Subpart.

I have implemented the following tables (look at the composite primary keys marked with *):

  • Object (*IDObject)

  • Part (*IDPart, *IDObject): IDObject is a Foreign key (the prim. key of Object).

  • Subpart (*IDSubpart, *IDPart, *IDObject): *IDPart, *IDObject is a Foreign key (the prim. key of Part).

  • Item (*IDItem, *IDSubpart, *IDPart, *IDObject): *IDSubpart, *IDPart, *IDObject is a Foreign key (the prim. key of Subpart).

1.- I have understood that this design is in 3NF, isn't it?

Assuming that–by theory–Item is a weak entity depending on Subpart; Subpart is a "weak entity" depending on Part, and Part is a "weak entity" depending on Object, these would be the reasons for those composed keys, making this design in 3NF.

But for other people, this design is redundant, and the "best" way for resolving it is through the following tables specification (look at the primary keys marked with *):

  • Object (*IDObject)
  • Part (*IDPart, IDObject): IDObject is a Foreign key, but doesn't make part of the prim. key of Part.
  • Subpart (*IDSubpart, IDPart): IDPart is a Foreign keys, but doesn't make part of the prim. key of Subpart.
  • Item (*IDItem, IDSubpart): IDSubpart is a Foreign keys, but doesn't make part of the prim. key of Item.

In this specification, the restrictions of "one and only one" would be managed by code.

2.- Why would I choose this implementation, declining to use 3NF?

Best Answer

Both designs are technically 3NF, depending on your data population.

What a lot of people who have only a cursory understanding of formal normalization theory don't understand is that redundancy in foreign keys isn't really redundancy by definition. Transitive dependencies and partial dependencies apply to the way non-key attributes relate to key attributes.

If your composite keys happen to contain (or even consist of) foreign keys then whether your foreign key has the look of being denormalized isn't relevant. (i.e. Design 1 isn't redundant in a way which is significant to the formal definition of normal forms.)

The practical issue comes down to what the values of your keys are. If you have natural keys and the Part key consists of IDObject=123 and IDPart=1 - i.e. your IDPart values are natural number sequences which are only unique within the context of a given IDObject value, then you need to use design 1 (weak entities).

However, what a lot of people do for good (and less good) reasons is apply surrogate primary keys to every or nearly every table. I'm not going to get into whether or why to use surrogate keys. That is a holy war for another question. However, if you happen to have table-wide unique (e.g. IDENTITY) values for IDPart, then your primary key for Part is more complex than it needs to be using design 1. In that case, there is no need for IDObject in the PK of Part because the table already has a unique key just using IDPart. In such a case, design 2 is sufficient and simpler, therefore better, as a rule of thumb.

In practice, I would say weak entities are much more common in logical models than in physical models, if for no other reason than that most physical models tend to use surrogate primary keys.