How to map an IS-A relationship into a database

database-design

Consider the following:

entity User
{
    autoincrement uid;
    string(20) name;
    int privilegeLevel;
}

entity DirectLoginUser
{
    inherits User;
    string(20) username;
    string(16) passwordHash;
}

entity OpenIdUser
{
    inherits User;
    //Whatever attributes OpenID needs... I don't know; this is hypothetical
}

The different kinds of users (Direct Login users, and OpenID users) display an IS-A relationship; namely, that both types of users are users. Now, there are several ways this can be represented in an RDBMS:

Way One

CREATE TABLE Users
(
    uid INTEGER AUTO_INCREMENT NOT NULL,
    name VARCHAR(20) NOT NULL,
    privlegeLevel INTEGER NOT NULL,
    type ENUM("DirectLogin", "OpenID") NOT NULL,
    username VARCHAR(20) NULL,
    passwordHash VARCHAR(20) NULL,
    //OpenID Attributes
    PRIMARY_KEY(uid)
)

Way Two

CREATE TABLE Users
(
    uid INTEGER AUTO_INCREMENT NOT NULL,
    name VARCHAR(20) NOT NULL,
    privilegeLevel INTEGER NOT NULL,
    type ENUM("DirectLogin", "OpenID") NOT NULL,
    PRIMARY_KEY(uid)
)

CREATE TABLE DirectLogins
(
    uid INTEGER NOT_NULL,
    username VARCHAR(20) NOT NULL,
    passwordHash VARCHAR(20) NOT NULL,
    PRIMARY_KEY(uid),
    FORIGEN_KEY (uid) REFERENCES Users.uid
)

CREATE TABLE OpenIDLogins
(
    uid INTEGER NOT_NULL,
    // ...
    PRIMARY_KEY(uid),
    FORIGEN_KEY (uid) REFERENCES Users.uid
)

Way Three

CREATE TABLE DirectLoginUsers
(
    uid INTEGER AUTO_INCREMENT NOT NULL,
    name VARCHAR(20) NOT NULL,
    privlegeLevel INTEGER NOT NULL,
    username VARCHAR(20) NOT NULL,
    passwordHash VARCHAR(20) NOT NULL,
    PRIMARY_KEY(uid)
)

CREATE TABLE OpenIDUsers
(
    uid INTEGER AUTO_INCREMENT NOT NULL,
    name VARCHAR(20) NOT NULL,
    privlegeLevel INTEGER NOT NULL,
    //OpenID Attributes
    PRIMARY_KEY(uid)
)

I'm almost certain the third way is the wrong way, because it's not possible to do a simple join against users elsewhere in the database.

My real world example is not a users with different logins example though; I'm interested in how to model this relationship in the general case.

Best Answer

Way two is the correct way.

Your base class gets a table, and then child classes get their own tables with just the additional fields they introduce, plus foreign key references to the base table.

As Joel suggested in his comments on this answer, you can guarantee that a user will have either a direct login or an OpenID login, but not both (and also possibly neither) by adding a type column to each sub-type table that keys back to the root table. The type column in each sub-type table is restricted to have a single value representing the type of that table. Because this column is foreign keyed to the root table, only one sub-type row can link to the same root row at a time.

For example, the MySQL DDL would look something like:

CREATE TABLE Users
(
      uid               INTEGER AUTO_INCREMENT NOT NULL
    , type              ENUM("DirectLogin", "OpenID") NOT NULL
    // ...

    , PRIMARY_KEY(uid)
);

CREATE TABLE DirectLogins
(
      uid               INTEGER NOT_NULL
    , type              ENUM("DirectLogin") NOT NULL
    // ...

    , PRIMARY_KEY(uid)
    , FORIGEN_KEY (uid, type) REFERENCES Users (uid, type)
);

CREATE TABLE OpenIDLogins
(
      uid               INTEGER NOT_NULL
    , type              ENUM("OpenID") NOT NULL
    // ...

    PRIMARY_KEY(uid),
    FORIGEN_KEY (uid, type) REFERENCES Users (uid, type)
);

(On other platforms you would use a CHECK constraint instead of ENUM.) MySQL supports composite foreign keys so this should work for you.

Way one is valid, though you are wasting space in those NULL-able columns because their use depends on the type of user. The advantage is that if you choose to expand what kinds of user types to store and those types don't require additional columns, you can just expand the domain of your ENUM and use the same table.

Way three forces any queries that reference users to check against both tables. This also prevents you from referencing a single users table via foreign key.