Sql-server – Basic RDBMS table relations

database-designsql serversql-server-2008-r2subtypes

If you have a design of a "master table" and several "subtables", is it better* that the master table "link" to each of the sub tables, the subtables link to the master, or for bi-directional links be established? The data present in the subtables COULD be not present (i.e. it might not be relevant to the particular entry, so a linked entry in the subtable won't be created). Basically data will be related in a 1-to-(zero or one) fashion. There will never be a reason to link between the subtables WITHOUT involving the master table.

Specifically I'm concerned about better* with respect to MS SQL Server (2008R2+)

Example Design 1: Master Table links to Sub-tables

CREATE TABLE tMaster (
  PK_TM int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,IntrinsicField1 datetime NOT NULL
  ,IntrinsicField2 int NOT NULL
  ,IntrinsicField3 nvarchar(255) NULL
  ,index_subA int NULL --index created on this value. links to PK_SA of appropriate entry if it exists
  ,index_subB int NULL --index created on this value
  ....
  ,index_subM int NULL --index created on this value
)
CREATE TABLE tSubA
  PK_SA int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,ExtraField1 int NOT NULL
  ,ExtraField2 varchar(24) NOT NULL
) 
CREATE TABLE tSubB
  PK_SB int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,ExtF1 NUMERIC(5,3) NOT NULL
  ,ExtF2 NUMERIC(5,3) NOT NULL
) 
CREATE TABLE tSubM
  PK_SM int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,EField1 int NOT NULL
  ,EField2 nvarchar(100) NOT NULL
)

Example Design 2: Sub-Tables link to Master Table

CREATE TABLE tMaster (
  PK_TM int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,IntrinsicField1 datetime NOT NULL
  ,IntrinsicField2 int NOT NULL
  ,IntrinsicField3 nvarchar(255) NULL
)
CREATE TABLE tSubA
  lMaster_ID int NOT NULL PRIMARY KEY CLUSTERED -- value of PK_TM of cross-linked entry
  ,ExtraField1 int NOT NULL
  ,ExtraField2 varchar(24) NOT NULL
)
CREATE TABLE tSubB
  lMaster_ID int NOT NULL PRIMARY KEY CLUSTERED
  ,ExtF1 NUMERIC(5,3) NOT NULL
  ,ExtF2 NUMERIC(5,3) NOT NULL
  ,Master_ID int NOT NULL --index created on this value. links to PK_TM of appropriate entry
) 
CREATE TABLE tSubM
  lMaster_ID int NOT NULL PRIMARY KEY CLUSTERED
  ,EField1 int NOT NULL
  ,EField2 nvarchar(100) NOT NULL
)

Example Design 3: Bidirectional links

CREATE TABLE tMaster (
  PK_TM int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,IntrinsicField1 datetime NOT NULL
  ,IntrinsicField2 int NOT NULL
  ,IntrinsicField3 nvarchar(255) NULL
  ,index_subA int NULL --index created on this value. links to PK_SA of appropriate entry if it exists
  ,index_subB int NULL --index created on this value
  ....
  ,index_subM int NULL --index created on this value
)
CREATE TABLE tSubA
  PK_SA int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,ExtraField1 int NOT NULL
  ,ExtraField2 varchar(24) NOT NULL
  ,lMaster_ID int NOT NULL
) 
CREATE TABLE tSubB
  PK_SB int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,ExtF1 NUMERIC(5,3) NOT NULL
  ,ExtF2 NUMERIC(5,3) NOT NULL
  ,lMaster_ID int NOT NULL
) 
CREATE TABLE tSubM
  PK_SM int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  ,EField1 int NOT NULL
  ,EField2 nvarchar(100) NOT NULL
  ,lMaster_ID int NOT NULL
)

*better could be

  1. faster
  2. less likely to bork up over time as DB entries are added and/or occasionally deleted
  3. easier for other DBAs to quickly comprehend
  4. more intuitive for the average end-user
  5. more easily extensible by adding additional sub-table types
  6. more data-compact
  7. more easily modified to be a one master to many copies of a subtable
  8. more easily modified to be a many masters to a single copy of a subtable
  9. insert your own definition here

(less emphasis on options 7&8 here)

Best Answer

There is not really much to talk about here. At least, the conventional wisdom is very clear. Only option 2 should be considered unless you have a very, very good reason, borne out of practical experience with your database, to do otherwise.

Neither option 1 nor option 3 have a parent table in first normal form (1NF) because they include repeating columns, i.e. the foreign keys to the sub-tables.

In a transactional system, you should start from the assumption that your tables should be in 3NF at least. Back away from that only for a good, well considered reason.

One such reason might be that there are lots of sub-tables and you don't know which one(s) might need to be joined to the parent for a particular parent record. It could happen that a performance issue results, depending on many factors. One compromise that you could use if this problem did occur, which you might consider Option 4 is to have bit flags on the parent table indicating the presence of each type of child. This is still begging for an eventual data inconsistency problem, but it at least will be less inconsistent than option 3, because it risks only pointing you to an empty join rather than to an incorrect record.