Database Design – Deciding Between Complete Disjoint or Incomplete Overlapping in Supertype/Subtype

database-designsubtypes

I'm building an inventory database that stores IT hardware, such as desktop computers, laptops, switches, routers, mobile phones, etc. I'm using supertype/subtype pattern, where all devices are stored in a single table, and specific information is put into subtype tables. My dilemma is choosing between the following two designs:

enter image description here

In the top diagram all devices share common subtypes. For example, desktop computers and laptops would have records in the following tables: Device, NetworkDevice. A switch would have records in: Device, NetworkDevice. A router would have records in: Device, NetworkDevice, WANDevice. Any device for which we track location will have a record in Location. Some pros and cons that I thought of for this setup:

  • Pro: SELECTing records based on a common field, like Hostname, or LocationID is easier.
  • Pro: No null fields.
  • Con: Tables that should be included in CRUD operations for a particular device are not obvious, and may confuse future DBAs.

In the bottom diagram all devices have their own subtype (There are more classes of device that are not shown here). In this situation, it is obvious which tables records get inserted to or selected from. Desktop computers and laptops go in Computer, etc. Some pros and cons that I thought of for this setup:

  • Pro: It is immediately obvious which tables to use for CRUD operations for subtypes.
  • Pro: Only have to use one table for CRUD operations.
  • Con: SELECTing records based on common subtype fields requires all tables to be combined, for example searching by Hostname, or LocationID.

In both situations, the ClassDiscriminator field is placed in subtype tables for use with a CHECK constraint to control which types can be inserted.

Are there any recommendations for which design is better, or is it completely a matter of opinion and dependent on the intended purpose of the database?

EDIT: A specific question I have regards the overlapping nature of the table "NetworkDevice". This table is meant to hold network information for any device with a hostname and/or IP address, whether it is a computer, switch, or router. Is the overlapping nature of this table something that could cause problems, or is it okay to implement it this way?

Thank you in advance for any input provided. Please ask if any additional information is needed.

Best Answer

Physical implementation of subtyping in a database is a complex issue. Unless you have a situation where it offers compelling advantages (see below for one or two examples) it adds complexity into implementation while providing relatively little value.

Having done this with really complex subtyping (applicaitons and sentences on a court case management system, disparate combined-risk commercial insurance contract structures) I guess I have some observations on this. Some significant corner cases are:

  • If the total number of database fields across the subtypes is relatively low (say: less than 100) or there is significant commonality between subtypes then splitting the subtypes out into separate physical tables is probably of little value. It will add significant overhead to reporting queries and searches. In most cases it's best to have a single table and manage your subtyping within the application. (Probably the closest to your problem)

  • If your subtyping is very disjoint, and different subtypes have type-dependent data structures hanging off them (i.e. child tables or more complex structures), then subtype tables make sense. In this case, each subtype probably has relatively little commonality within the application (i.e. there is probably a whole subsystem within the application dedicated to that subtype). Most reporting and querying will probably occur within a given sub-type, with cross-type queries mainly being restricted to a handful of common fields. (Court case management system)

  • If you have a large number of subtypes with disparate attributes and/or a requirement to make this configurable then a generic structure and supplementary metadata may be more appropriate. See this SO posting for a rundown on some possible approaches. (Insurance policy administration system)

  • If you have a very large number of fields with little commonality across your sub-types and little requirement to query across sub-type tables (i.e. nothing much in the way of multi-way outer joins against your sub-type tables) then sub-type tables may help to manage the column sprawl. (Pathologically complex version of your problem)

  • Some O/R mappers may only support a particular approach to managing sub-classes.

In most cases physical sub-type tables in a DB schema are a bit of a solution in search of a problem, as they potentially have undesirable side-effects.

In your case, I assume you have a relatively modest number of sub-types and a manageable number of attributes. Your diagram and question don't indicate any intention to hang child tables off the records. I would suggest that you consider going with the first option suggested above and maintaining one table and manage the sub-typing within your application.