Database Design – How to Implement Different Object Types as Similar Tables

database-designperformance

I have a database I am attempting to design. I am trying to determine the best approach to declare similar, but different types of objects, each with their own sets of properties (perhaps some properties identical between objects).

Let's say, for example, I am building a mall web app. I want to set up a facility for shop owners to register their stores, type of store and some specific information for that store. The problem I am faced with is the best approach to model different shop types and how to incorporate a model for data storage based on the type of shop.

To further extend the example with some concrete data (let's just pretend this information is all necessary for storage), what would be the best approach to model something like this:

Thai Fooooood
Shop Type: FoodShop
Most Popular Item: Pad Thai 
Visitors Last Week: 368
Anti-Static Wristband Repair
Shop Type: RepairShop
Current Computer Repairs: 26
Repair Length Estimate (in business hours): 4 
Visitors Last Week: 182
Macroputers
Shop Type: RetailShop
Next Promotion Date: January 2, 2015
Current Week Sales Estimates: 122418
Visitors Last Week: 411

Assuming many other instances of these shop types exist, with each type storing the same data. Does it make more sense to set up a table for each shop type, to use a single table to represent all shop types (using generic data), or is there a better approach?

Best Answer

Assuming that the app will be treating these various shop-types in the same manner for some operations (operations that do not care what type of shop that it is), then my preferred approach is to use a subclass / inheritance model. I have detailed this out in a couple of other answers here:

The basic concept would be as follows (using Microsoft SQL Server T-SQL semantics).

First you need the main entity table. It represents various "types" so those need to be defined in a lookup table (and the lookup table should have a matching enum in the app layer). Please note that I am using a TINYINT here which holds values from 0 to 255. Use the smaller numeric type that will meet your needs.

CREATE TABLE dbo.ShopType
(
  ShopTypeID TINYINT NOT NULL PRIMARY KEY, -- NO IDENTITY / auto-increment!
  ShopType   VARCHAR(50) NOT NULL
);

CREATE TABLE dbo.Shop
(
  ShopID     INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
  ShopTypeID TINYINT NOT NULL REFERENCES dbo.ShopType (ShopTypeID), -- FK to ShopType
  ShopName   NVARCHAR(100) NOT NULL,
  VisitorsLastWeek INT NOT NULL DEFAULT (0),
  -- {OtherCommonFields},...
  CreateDate DATETIME NOT NULL DEFAULT (GETDATE()), -- or GETUTCDATE()
  ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);

Next you create a subclass table for each particular "type". It is sometimes appropriate to share a subclass table for multiple "types", but just be careful when doing that so one type doesn't evolve differently than the other type(s) that might be sharing the same subclass table. Please note that the PK of the subclass tables is the same field as the PK in the parent class (i.e. the [Shop] table) and even FKs to it. The reason is that the row in both tables (e.g. [Shop] and [ShopRepair] that have the same value for [ShopID] are together a complete entity; the row in the subclass table does not represent anything by itself and doesn't need a separate ID.

CREATE TABLE dbo.ShopFood
(
  ShopID INT NOT NULL PRIMARY KEY REFERENCES dbo.Shop(ShopID),
  MostPopularItem  NVARCHAR(50) NULL,
  ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);

CREATE TABLE dbo.ShopRepair
(
  ShopID INT NOT NULL PRIMARY KEY REFERENCES dbo.Shop(ShopID),
  CurrentComputerRepairs SMALLINT NOT NULL DEFAULT (0),
  RepairEstimatedHours TINYINT NOT NULL DEFAULT (0),
  ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);

CREATE TABLE dbo.ShopRetail
(
  ShopID INT NOT NULL PRIMARY KEY REFERENCES dbo.Shop(ShopID),
  NextPromotionDate DATETIME NULL,
  CurrentWeekSalesEstimate MONEY NULL,
  ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);

To simplify SELECT operations, you can create Views against each type, if you like:

CREATE VIEW dbo.FoodShop AS
SELECT sh.ShopID, sh.ShopTypeID, sf.MostPopularItem
FROM   dbo.Shop sh
INNER JOIN dbo.ShopFood sf
        ON sf.ShopID = sh.ShopID;

CREATE VIEW dbo.RepairShop AS
SELECT sh.ShopID, sh.ShopTypeID, sr.CurrentComputerRepairs, sr.RepairEstimatedHours
FROM   dbo.Shop sh
INNER JOIN dbo.ShopRepair sr
        ON sr.ShopID = sh.ShopID;

CREATE VIEW dbo.RetailShop AS
SELECT sh.ShopID, sh.ShopTypeID, sr.NextPromotionDate, sr.CurrentWeekSalesEstimate
FROM   dbo.Shop sh
INNER JOIN dbo.ShopRetail sr
        ON sr.ShopID = sh.ShopID;

EDIT:
I forgot to add in the "audit" fields of [CreateDate] and [ModifiedDate]. I have now added them to the tables above. Please note that:

  • these fields are not in the [ShopType] table as there really is no purpose here, but that is a bit subjective and some people still like to know at least when the records were created, so maybe [CreateDate] is fine there

  • the [CreateDate] field is not in the subclass tables as the assumption is that those rows are created at the same time as the parent record. But when it comes to updates, those might just touch one of the two tables, depending on which fields are being updated. If using SQL Server, it is possible to create the records for both tables in a single statement even, using the OUTPUT clause as follows:

    CREATE PROCEDURE dbo.Shop_CreateRepairShop
    (
      @ShopName NVARCHAR(100),
      @CurrentComputerRepairs SMALLINT,
      @RepairEstimatedHours TINYINT,
      @ShopID INT OUTPUT = -1
    )
    AS
    SET NOCOUNT ON;
    
    INSERT INTO dbo.Shop (ShopName, ShopTypeID)
      OUTPUT INSERTED.ShopID, @CurrentComputerRepairs, @RepairEstimatedHours
      INTO   dbo.ShopRepair (ShopID, CurrentComputerRepairs, RepairEstimatedHours)
      VALUES (@ShopName, 2); -- assuming ShopTypeID for "Repair Shop" = 2
    
    SET @ShopID = SCOPE_IDENTITY();
    
Related Question