I'll answer obliquely...
The natural key is always the natural key and should be enforced with a unique constraint or index. This is the "primary key" that flows from your modelling phase.
The choice of an auto-number/identity surrogate key matters at implementation phase because there are good and bad choices for your clustered index (example: SQL Server, Sybase, MySQL InnoDB, Oracle IOT).
That is, primary key is orthogonal to your clustered index: don't confuse the two issues
I'd suggest using a contrived key adds no value over using an auto-number/identity column in this respect. You lose data from the natural key, probably won't be unique, is just as opaque.
FWIW, I use surrogate keys and composite keys when I need too:
- Some natural keys are useful in their own right: ISO currency and country codes
- A table with no secondary (non-clustered) indexes and no child table doesn't benefit from a surrogate key
- If you have parent-child-grandchild, then I usually need to join parent-grandchild: with composite keys I can do so directly. Simpler JOINs, simpler indexes
Note: this assumes that every table requires a clustered index
Related on dba.se: SQL Server Primary key / clustered index design decision
Situation 1:
Your tables have one relationship and not two. (example: a Device
belongs to a DeviceType
)
So, keep only one relationship, the one with the composite keys (that include the Primary Key). The other relationship is redundant when the composite one is defined.
I would also suggest you have same names for related columns:
DeviceCategory Table
CategoryCode | Name | Description
----------------------------------------
WKS | Workstation | Description of what classifies an item as a workstation...
LPT | Laptop | Description of what classifies an item as a laptop...
DeviceType Table
DeviceTypeID | CategoryCode | Manufacturer | Model | IsTrackedInOtherSystemDefault
-----------------------------------------------------------------------------------
1 | WKS | Dell | GX1000 | true
2 | LPT | HP | dv4000 | false
3 | WKS | HP | xx9000 | false
Device Table
DeviceID | SerialNumber | DeviceTypeID | IsTrackedInOtherSystem | CategoryCode
------------------------------------------------------------------------------
1 | I81U812 | 1 | true | WKS
2 | N0S4A2 | 1 | false | WKS
3 | 3BL1NDMIC3 | 2 | false | LPT
So, the design would be:
DeviceCategory
--------------
CategoryCode PK
Name U1
Description
DeviceType
----------
DeviceTypeID PK U1
CategoryCode FK U1
Manufacturer U2
Model U2
IsTrackedInOtherSystemDefault
Device
------
DeviceID PK U1
SerialNumber U2
DeviceTypeID FK1
IsTrackedInOtherSystem
CategoryCode FK1 U1
and for the Computer
:
Computer
--------
DeviceID PK FK1
Hostname U1
IPAddress U2
CategoryCode FK1 CHK
The "additional" UNIQUE
keys (the two composite U1
ones) will be needed in most DBMS to enforce the foreign key constraints. I guess this answers your question 2, relationships needs indices to be enforced, so (you have to) use them. They will be used by the DBMS not only to enforce integrity but in your queries/statements, when you will be joining the tables.
The only one that is not needed is the U3
you had in the Computer
table.
About question 3 (the over-engineering part): No, I don't think so but that's just my opinion. And you haven't told us if this is a homework/exercise or a real project, whether you will be holding only your family's or a multi-million company's inventory, etc.
Situation 2
I think what you have is fine and there is no need (and not a good idea) to have referential integrity constraints on these columns. This is a default value that is copied in the second table via a stored procedure (I guess during Inserts on the second table?) or altered by a user. If you add an FK, won't that deny users the ability to override the default?
The names of the two columns are self-explanatory enough for a DBA to understand the functionality.
Best Answer
If you were using a person's name as a primary key and their name changed you would need to change the primary key. This is what
ON UPDATE CASCADE
is used for since it essentially cascades the change down to all related tables that have foreign-key relationships to the primary key.For example:
A
SELECT
against both tables:Returns:
If we update the
PersonKey
column, and re-run theSELECT
:we see:
Looking at the plan for the above
UPDATE
statement, we clearly see both tables are updated by a single update statement by virtue of the foreign key defined asON UPDATE CASCADE
:click the image above to see it in more clarity
Finally, we'll cleanup our temporary tables:
The preferred1 way to do this using surrogate keys would be:
For completeness, the plan for the update statement is very simple, and shows one advantage to surrogate keys, namely only a single row needs to be updated as opposed to every row containing the key in a natural-key scenario:
The output from the two
SELECT
statements above are:Essentially, the outcome is approximately the same. One major difference is the wide natural key is not repeated in every table where the foreign key occurs. In my example, I'm using a
VARCHAR(200)
column to hold the person's name, which necessitates using aVARCHAR(200)
everywhere. If there are a lot of rows and a lot of tables containing the foreign key, that will add up to a lot of wasted memory. Note, I'm not talking about disk space being wasted since most people say disk space is so cheap as to be essentially free. Memory, however, is expensive and deserves to be cherished. Using a 4-byte integer for the key will save a large amount of memory when you consider the average name length of around 15 characters.Tangential to the question about how and why keys can change is the question about why to choose natural keys over surrogate keys, which is an interesting and perhaps more important question, especially where performance is a design-goal. See my question here about that.
1 - http://weblogs.sqlteam.com/mladenp/archive/2009/10/06/Why-I-prefer-surrogate-keys-instead-of-natural-keys-in.aspx