Database Design – When to Extend to Child Tables

database-design

I have a table of entities. For example purposes, lets call them Vehicles: Car, Boat, Motercycle, etc.

I'd ideally like to place all entities in a single Vehicles table because a lot of related tables (VehicleRatings, VehicleComments, VehicleHistory, etc) are going to apply to any entity, not just one of them.

But some of the entities have a few individual properties that aren't shared with the other entity types. Right now, there is only 1 extra property for one entity, and 2 for another. I expect there might be a few more that get identified later on, but not many overall.

How do I know when it's better to create a child table for each entity type to store these individual properties, as opposed to just adding an extra column in the parent Vehicles table? Is there any questions I can ask myself to help determine this answer?

I am looking to optimize for query performance primarily, followed by ease of maintenance.

Best Answer

From a logical ERD perspective, the appropriate design is clearly the entity supertype/subtype pattern. What you're wrestling with is how to implement this from a physical perspective.

Like a lot of data modeling issues, there is no hard and fast rule. You are looking at a compromise. Do you want a fair bit more complexity in your application logic and queries (supertype/subtype) or do you want the risk that your application logic might not properly enforce your constraints and that a Car record in your consolidated vehicle table might get a non-null value in the anchor_weight column?

The kinds of things that you should consider in deciding how to trade off these alternatives are:

  • How many subtype dependent columns are there?
  • How likely is it that new subtypes will be defined in the future - or that new subtype dependent columns will be needed?
  • How serious would it be if inappropriate data (column values) were recorded for some records?
  • Where do you stand philosophically? (e.g. "Null columns are evil", "Application-enforced constraints are evil", "1:1 relationships are evil", "Inner joins are evil", etc.)
  • How many queries do I have which target only one subtype?
  • Do I have any programming environment constraints which make supertype/subtype physical models harder to work with (e.g. heavy use of ORMs or simplistic tools like Access).

At the end of the day, you are going to make a practical decision based on what is most important to you. Whatever you decide will have pros and cons - but they will be your pros and cons.