Your points are unrelated to database design: choice of natural or surrogate key is an implementation decisions after conceptual and logical models are complete
In addition to comments and other answers:
- some natural keys work well such as currency or language codes (CHF, GBP, DE, EN etc)
- avoiding composite keys forces you to always join intermediate tables (rather than simple) parent-grandchild
- adding a surrogate key in unnecessary for link tables
Edit: example of "composite keys"
Assume: t1 has child t2 has child t3
If you had the key of t1 in t3 (composite key) you can join t1 and t3 directly.
t1 key is also the left hand column of t3 key so you don't need an extra index
With a surrogate key/FK, you have to join via t2
You need extra indexes on the FK columns in t2 and t3 which
This latter option with the "always use surrogate key" dogma
- adds complexity
- decreased or reverses disk space "savings"
Both solutions for T or T-alternate are considered denormalizations. Denormalizations optimize a datamodel that is normally in third normal form into a structure that is more convenient for select queries. The correct data already exists in CC and CD. T or T-alternate are duplicating the data. With duplicate (denormalized) data you have to make sure that data cannot get out of synch with its parent.
I think the first option is the better one. The constraints are simple and straight-forward. There is no check constraint on multiple fields with nulls. You need to make sure that any insert into CC or CD also inserts into T. After-insert triggers on CC and CD would automatically insert A,B,C,null into T or A,B,null,D into T.
One thing to plan for is that a parent has two children, each with many rows per parent. Lets assume a row in P has 3 rows in CC, 0 in CD. If you write a query over P inner joining to CC and CD, it shows zero rows. An left outer join (parent on left) shows 3 rows.
Lets assume CC has 3 rows and CD has 4 rows. Inner join and outer join queries would show 3*4 = 12 rows.
Lets assume a 3rd set of data. CC is the main child table with 20 rows per P. CD is a descriptive text field with 3 rows per P. You could reduce the resulting row to 20 per P with an aggregation function like Oracle 11g's LISTAGG
over CD. This combines the D descriptive field into a comma separated list "Planes, Trains, Automobiles".
Having CCA and CDA fields in T-alternative increases the chances for data to get out of synch, since CCA might not equal CDA when something goes wrong in the code. The T-alternative is not the standard way of approaching this problem. The first option does have some challenges, but it is simpler, more standard, and more likely to hold up under production systems.
Best Answer
Both composite key and compound key describe a candidate key with more than one attribute. According to the Relational Database Dictionary (C.J.Date) they mean the same thing.
In ER modelling the term "compound key" also has a more specific meaning. It means a key whose constituent attributes are references to keys in other entities - i.e. a compound key forms an identifying relationship. For most purposes, this isn't an especially useful or important concept so the terms composite/compound are often treated as interchangeable. It's probably best to stick to "composite key" unless you are referring specifically to the ER modelling concept of a compound key.