Mysql – Representing a Superclass/Subclass (or Supertype-Subtype) relationship in a MySQL Workbench diagram

database-designdatabase-diagramserdMySQLsubtypes

Situation

I am trying to figure out how to represent a one-to-one (1:1) relationship regarding a superclass/subclass or supertype-subtype structure in an entity-relationship diagram (ERD) of the kind created by means of MySQL Workbench (which, as you know, looks more close to a concrete SQL implementation than an original P. P. Chen diagram).

I am new to this, so I am not sure whether I have to type out all the attributes1 again in the subclass entities (or subtypes), i.e., MANAGER and DRIVER, or whether I there is a way of connecting to and grabbing the attributes from the superclass (or supertype) so that it is more clear that they relate, something like a FOREIGN KEY, I guess.

1. Each attribute in EMPLOYEE applies to MANAGER and DRIVER, but MANAGER and DRIVER will have some extra attributes that are not characteristics of the EMPLOYEE entity type (or table, once it is implemented).

Current diagram

Bellow is the ERD I have created so far:

Diagram

Question

So, how can I represent accurately this sort of relationship in my diagram?

Best Answer

This is a case of what's called "generalization/specialization" in ER lingo. It's really the same thing as what object models call "superclass/subclass" as you have done. There are two separable issues. The first is how do you want to draw the diagram, and the second is how do you want to design the tables.

As far as drawing the diagram goes, I would draw it the same way you have, with one exception. The lines leading out of Managers and Drivers would not go separately to Employees. Instead, they would go to something I'll call a gen/spec box. The following sample shows one way to depict a gen/spec case. Then a single line connects the gen/spec box with the Employees table. It all depends on which way seems simpler and clearer to you and to others who look at the diagram.

As far as tables go, you could have three tables, just as you have shown. This is sometimes called "class table inheritance". There is a technique you could benefit from, called "shared primary key". In this technique, the manager table and the driver table do not have an id field of their own. Instead, there is a copy of Employee_id in each subclass table, and employee_id is declared to be both the primary key of this table and also a foreign key that references Employees.

Using shared primary key buys you a few things and it costs you something.

It buys you enforcement of the 1:1 limit on manager-employee or driver-employee. It may end up speeding the resulting joins. And it allows a fourth table that uses employee-id as a foreign key to be joined directly to Drivers or Managers. The irrelevant employees drop out of the join.

What it costs you is that you have to obtain and use a copy of the right employee_id every time you go to insert a driver or a manager.

You could also just pack driver attributes and manager attributes into the employee table, leaving irrelevant fields NULL. That results in fewer joins, but it does have a downside. It makes the employee table slightly bigger and slower. And if you use nullable fields in WHERE clauses, you'll have to learn how SQL does three valued logic (True, False, and Unknown). For some people, this is awfully difficult to grasp. I try to avoid three valued logic myself.