Why does key “concatenation” help to bring an unnormalized table to 1NF?

Tags: database-design, normalization, primary-key

This question is about the technique for bringing an unnormalized table to first normal form (1NF) that my textbook, "A Guide To MySQL", presents. It gives an example of an unnormalized table with repeating groups:

  • ORDERS (ORDER_NUM, ORDER_DATE, (PART_NUM, NUM_ORDERED)).

There are multiple entries in the PART_NUM column because a customer can order many parts at once (associated with a single order). The solution is to create a composite key (ORDER_NUM, PART_NUM), so now there are multiple rows in the ORDERS table with the same ORDER_NUM but different PART_NUM that represent a customer ordering multiple parts.

My question is: how does that solution make the repeating groups go away? Couldn't I have a composite key of ORDER_NUM and PART_NUM with PART_NUM still holding multiple entries? I'm confused.

Unnormalized table: [image: Unnormalized Table]

Normalized to 1NF table: [image]

Best Answer

Using the schema shown, you are GOING to have multiple instances of the same PART_NUM, but the ORDER_NUM will be different for each of those. The key is the unique combination of ORDER_NUM and PART_NUM.

The composite key will permit one record per unique combination of order and part. A specific part can be associated with any order. Presumably the NUM_ORDERED column specifies the quantity of that particular part on that order.
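
As a rough illustration of the point above, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL, purely so it runs without a server. The sample values and the lowercase column names are invented for the example and are not taken from the textbook. It flattens each repeating group into one row per (order, part) pair, then shows that the composite key rejects a second row for the same pair.

```python
import sqlite3

# Hypothetical unnormalized data: each order carries a repeating group of
# (PART_NUM, NUM_ORDERED) pairs. All values are invented for illustration.
unnormalized = [
    (1, "2024-01-05", [(10, 2), (11, 1)]),
    (2, "2024-01-06", [(10, 4)]),
]

# Flatten: one row per (order, part), so every column holds a single value.
rows = [
    (order_num, order_date, part_num, num_ordered)
    for order_num, order_date, parts in unnormalized
    for part_num, num_ordered in parts
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_num   INTEGER NOT NULL,
        order_date  TEXT    NOT NULL,
        part_num    INTEGER NOT NULL,
        num_ordered INTEGER NOT NULL,
        PRIMARY KEY (order_num, part_num)
    )
    """
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)

# Part 10 appearing on orders 1 and 2 is fine, but repeating an existing
# (order_num, part_num) pair violates the composite primary key.
try:
    conn.execute("INSERT INTO orders VALUES (1, '2024-01-05', 10, 7)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Note that the key itself does not remove the repeating group; the flattening step does. The key only guarantees that each (order, part) pair appears at most once afterwards.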

Personally, I might use a surrogate key with a somewhat different schema. But that isn’t what your textbook is trying to illustrate in this case.

OP's Comment

I realized the uniqueness gained by order and part making up the primary key. I guess I'm still wondering if/how that actually gets rid of repeating groups (the whole point of 1NF). I definitely see an improvement in the design, but it feels like my book is leaving out part of the explanation.

Comments incorporated into the answer

I dislike the design in your textbook. I would use 3 tables: orders, parts, and orders_parts (a many-to-many join table). The order and the part are defined one time each, then referenced from the join table to put parts on orders.

There is more to a part and an order than just the id. All of that definition data is in the parts and orders tables, specified just once. The ids are then referenced in the orders_parts join table, where they are foreign keys. The join table is just keys. It lets you put multiple parts on an order, and put the same part on multiple orders, without any redundant data.

Don't let the repetition of keys throw you. The data is not repeated. You don't define any part more than once. You don't define any order more than once. But you do reference the same parts and orders multiple times. In the example just below, Part 10 is referenced on Orders 1 and 2. Part 11 is referenced on Orders 1 and 3, etc.

Order | Part
--------------
  1      10
  1      11
  1      12
  2      10
  2      12
  3      11

The two foreign keys combined can also serve as a composite key (unique and/or primary) if the design calls for it. Referencing rows by key like this, instead of repeating their data, is what makes the database relational.
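
The three-table design described above could be sketched roughly as follows, again with Python's built-in sqlite3 module standing in for MySQL. The column names (order_id, part_id, order_date, description) and the dates and descriptions are assumptions made up for the example; only the order/part pairs come from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT);
    CREATE TABLE parts  (part_id  INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE orders_parts (
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        part_id  INTEGER NOT NULL REFERENCES parts(part_id),
        PRIMARY KEY (order_id, part_id)  -- the two foreign keys combined
    );
    """
)

# Each order and each part is defined exactly once (hypothetical details).
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-01-05"), (2, "2024-01-06"), (3, "2024-01-07")],
)
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [(10, "widget"), (11, "gasket"), (12, "bracket")],
)

# The join table holds only keys: the same pairs as the table above.
pairs = [(1, 10), (1, 11), (1, 12), (2, 10), (2, 12), (3, 11)]
conn.executemany("INSERT INTO orders_parts VALUES (?, ?)", pairs)

orders_with_part_10 = [
    r[0]
    for r in conn.execute(
        "SELECT order_id FROM orders_parts WHERE part_id = 10 ORDER BY order_id"
    )
]

# Referencing an order that was never defined is rejected by the foreign key.
try:
    conn.execute("INSERT INTO orders_parts VALUES (4, 10)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
```

Here the part descriptions and order dates live in one place each, and orders_parts repeats nothing but the ids.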