Mysql – How to reduce table duplication, while ensuring integrity of the database

database-designMySQLschema

For example, in my app I have some class called Contact:

case class Contact(name: String, email: String, phone: String)

And several other classes have one-to-many relation to that contact (for example, there are Companies and Houses).

Currently, I have database table for Contacts set up as something like this:

CREATE TABLE `Contacts` (
 `id` bigint(20) NOT NULL AUTO_INCREMENT,
 `entityId` bigint(20) NOT NULL,
 `entityType` tinyint(4) NOT NULL,
 `name` varchar(300),
 `phone` varchar(300),
 `email` varchar(300),
  PRIMARY KEY (`id`)
);

where entityType is storing the "code" that determines the entity type this contact belongs to (1 for Company, 2 for House, etc), and entityId points to the row in Companies or Houses table. Companies and Houses tables are defined as follows:

CREATE TABLE `Companies` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `company_name` varchar(300) DEFAULT NULL,
  PRIMARY KEY (`id`)
);
CREATE TABLE `Houses` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `address` varchar(300) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

So, when I need to get all contacts for some Company (which has id = 1234 in this example), I execute the following query:

select * from Contacts c where c.entityType = 1 and c.entityId = 1234;

Now, I understand that this is awful – at the very least, I can't create a foreign key on entityId, thus I can't ensure that my data is consistent.

The alternative will be to create a separate Contacts table for each entity type (CompanyContacts, HouseContacts, etc) but that will result in lots of duplication in both database and application code.

Is there a better way to do this?

Best Answer

I might suggest to reorganize schema, to make it more scalable.

  1. DROP columns entityId and entityType.
  2. CREATE an Objects (or Entities) table which will store public / shared information for all objects / entities. It's primary key will migrate to Companies and to Houses with 1:1 relation. For one object might exist only one company OR only one house. Also, you will have to remove AUTO_INCREMENT from Companies and Houses, because in this case they will no longer generate identifiers themselves.
  3. Add foreign key to Contacts table from shared Objects table - contacts might exist for any type of object.

Structure might look like this:

Schema

You might obtain entity type by following query:

SELECT
    cnt.*,
    CASE
        WHEN NOT cmp.id IS NULL THEN 1
        WHEN NOT hs.id IS NULL THEN 2
    END as entityType
FROM
    Contacts as cnt
        INNER JOIN Objects as obj ON cnt.objectId = obj.id
        LEFT JOIN Companies as cmp ON cmp.id = obj.id
        LEFT JOIN Houses as hs ON cmp.id = hs.id

Also, note that entityId becomes objectId in this case.


Here is a CREATE-script for benchmarking and test:

CREATE TABLE IF NOT EXISTS `Objects` (
  `id` BIGINT NOT NULL AUTO_INCREMENT,
  `shared_property1` VARCHAR(45) NULL,
  `shared_property2` VARCHAR(45) NULL,
  `etc` VARCHAR(45) NULL,

  PRIMARY KEY (`id`))
ENGINE = InnoDB;

CREATE TABLE IF NOT EXISTS `Companies` (
  `id` BIGINT NOT NULL,
  `company_name` VARCHAR(300) NULL,
  `private_property1` VARCHAR(45) NULL,
  `private_property2` VARCHAR(45) NULL,
  `etc` VARCHAR(45) NULL,

  PRIMARY KEY (`id`),

  CONSTRAINT `FK_Companies_Objects`
    FOREIGN KEY (`id`)
    REFERENCES `Objects` (`id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

CREATE TABLE IF NOT EXISTS `Houses` (
  `id` BIGINT NOT NULL,
  `address` VARCHAR(300) NULL,
  `private_property1` VARCHAR(45) NULL,
  `private_property2` VARCHAR(45) NULL,
  `etc` VARCHAR(45) NULL,

  PRIMARY KEY (`id`),

  CONSTRAINT `FK_Houses_Objects`
    FOREIGN KEY (`id`)
    REFERENCES `Objects` (`id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

CREATE TABLE IF NOT EXISTS `Contacts` (
  `id` BIGINT NOT NULL AUTO_INCREMENT,
  `objectId` BIGINT NOT NULL,
  `name` VARCHAR(300) NULL,
  `phone` VARCHAR(300) NULL,
  `email` VARCHAR(300) NULL,

  PRIMARY KEY (`id`),

  INDEX `FK_Contacts_Objects` (`objectId` ASC),

  CONSTRAINT `FK_Contacts_Objects`
    FOREIGN KEY (`objectId`)
    REFERENCES `Objects` (`id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;