Why does key “concatenation” help to bring an unnormalized table to 1NF

database-designnormalizationprimary-key

This question is about the proposed technique to bringing an unnormalized table to first normal form (1NF) that my textbook "A Guide To MySQL" presents. It gives an example of an unnormalized table with repeating groups:

ORDERS (ORDER_NUM, ORDER_DATE, (PART_NUM, NUM_ORDERED)).

There are multiple entries in the PART_NUM column because a customer can order many parts at once (associated with a single order). The solution is to create a composite key (ORDER_NUM, PART_NUM), so now there are multiple rows in the ORDERS table with the same ORDER_NUM but different PART_NUM that represent a customer ordering multiple parts.

My question is how does that solution make repeating groups go away? Couldn't I have a composite key with ORDER_NUM and PART_NUM but with PART_NUM still having multiple entries? I'm confused. Unnormalized table:

Normalized to 1NF table:

Best Answer

Using the schema shown, you are GOING to have multiple instances of the same PART_NUM, but the ORDER_NUM will be different for each of those. The key is the unique combination of ORDER_NUM and PART_NUM.

The composite key will permit one record per unique combination of order and part. A specific part can be associated with any order. Presumably the NUM_ORDERED column specifies the quantity of that particular part on that order.

Personally, I might use a surrogate key with a somewhat different schema. But that isn’t what your textbook is trying to illustrate in this case.

OP's Comment

I realized the uniqueness gained by order and part making up the primary key. I guess I'm still wondering if/how that actually gets rid of repeating groups (the whole point of 1NF). I definitely see an improvement in the design, but it feels like my book is leaving out part of the explanation.

Comments incorporated into the answer

I dislike the design in your textbook. I would use 3 tables: orders, parts, and orders_parts (a many-to-many join table). The order and the part are defined one time each, then referenced from the join table to put parts on orders.

There is more to a part and an order than just the id. All of that definition data is in the parts and orders tables, specified just once. The ids are then referenced in the orders_parts join table, where they are foreign keys. The join table is just keys. It lets you put multiple parts on an order, and put the same part on multiple orders, without any redundant data.

Don't let the repetition of keys throw you. The data is not repeated. You don't define any part more than once. You don't define any order more than once. But you do reference the same parts and orders multiple times. In the example just below, Part 10 is referenced on Orders 1 and 2. Part 11 is referenced on Orders 1 and 3, etc.

Order | Part
--------------
  1      10
  1      11
  1      12
  2      10
  2      12
  3      11

The two foreign keys combined are also a composite key (unique and/or primary), if the design calls for it. This is what makes the database relational.

Related Solutions

Ms-access – Does this database table violate normalization rules

#1. It took me a minute or two to find the problem. In it's current form, your WallContacts table requires one record per person per wall. So, if Joe is the owner of a building with 2 walls, he will need 2 records in WallContacts (one for each wall). This is because you're storing the WallID in that table.

Try this: Remove the WallID from WallContacts. Create a new table:

Table: Walls_vs_Contacts (I can't think of a better name)
ContactID (FK)
WallID (FK)

This table will serve as the go between between WallContacts and WallsMaster. So when creating queries, you join WallsMaster to Walls_vs_Contacts to WallContacts.

#2. A less important issue is that your WallInteraction doesn't include ContactID. Also, WallInteraction table can't properly record interactions between the staffmember and 2+ people. If this is an issue worth fixing (that's up to you to decide), you'd have to make an additional many-to-many table like in #1:

Table: Walls_vs_Contacts (I can't think of a better name)
ContactID (FK)
InteractionID (FK)

#3. Depending on how many staff members you have, you might want to make a table with a StaffID and StaffName fields. Otherwise, WallInteractions.Staffname will fill up with "Sheryl","Sherri","Sherryl LastName","S. Lastname", etc. making it impossible to search for all interactions involving her.

Otherwise, you're off to a solid start. I like how you identified and properly named the unique keys in advance. (Oh, and if you think I'm wrong, I probably am.)

Is it “OK” to have a table that only contain foreign keys

There is no problem having a table that consists of foreign keys only. In your particular example it would be OK to remove the OrderID. Relational theory states that each table should have a candidate key so, ideally, the business rule you mention would be in place to make the foreign key only table a "good" table. In practice, DBMS software will not care and will allow a table without a unique constraint.

There will not, generally, be any problem with inserts as long as you observe the uniqueness business rules you decide for your system. Conversely, without uniqueness, you may end up with duplicate rows from a SELECT which the application would have to deal with.

If you add further columns in the future there will be no problems due to the existing columns all being foreign keys.

This situation is very common when there is a many-to-many relationship between entity types. This is implemented as an intersection table containing the foreign keys.