Entity ID field that can refer to multiple tables

database-design

In designing both audit tables and notification tables, I've run into an issue where I need a row to refer to an entity in another table, but which table that entity is isn't always the same. For example:

  • Each time a user makes a post or uploads a new photo to their album, a notification is created. If the notification is for a post, the notification entry must refer to the posts table. If it is for a photo, the notification entry must refer to the photos table.
  • Moderators may edit posts, delete posts, or block users. The audit_actions entry must refer to the entity that was acted upon in the corresponding table.

My original approach looked like this (this is heavily simplified, not the actual table structure):

TABLE notification_types:
     id         INT
     type_id    INT
     entity_id  INT
  • If type_id was 1, entity_id would refer to an entry in the posts table
  • If type_id was 2, entity_id would refer to an entry in the photos table
  • etc

I was starting to use a similar scheme for audit tables. My plan was to make an audit_history table for each auditable table that mirrored the original table (e.g. posts and post_audit_history), and maintain a single audit_actions table that linked an actor and action to each history item

TABLE audit_actions
     admin_id        INT
     action_id       INT
     entity_id       INT
     audit_entry_id  INT

If action_id was 1 ("edited post"), entity_id would refer to an entry in posts, audit_entry_id would refer to an entry in posts_audit_history, etc


Of course, this approach is terrible. entity_id can refer to multiple tables, which is confusing and prevents it from being a foreign key. There is no way to enforce integrity without complicated triggers.

I've been trying researching and trying to find solutions but I am stuck. Some approaches I've considered:

  • Maintain an entity_ids table. Each entity in every table then has a foreign key to an entry in the entity_ids table; in other words, every single entity in any table has a globally unique ID. This maintains integrity, but when we looked up an item by its entity_id we'd have to search multiple tables to find the entity with that ID
  • Maintain separate notification tables for each type of notification, and separate audit tables for each type of auditable table. E.g. post_notifications, photo_notifications, post_audit_actions, profile_audit_actions, etc. This would require running messy and expensive joins for what were formerly simple queries like 'find all notifications for user X' or 'get all audit records for user Y' or 'get the 100 most recent audit records'

Which approach is best? Is there a better strategy that I haven't thought of here?

Best Answer

One idea comes from extending the analogy of interfaces in object-oriented programming. You can have an IGeneratesNotification "interface" entity that the Posts and Photos entities "implement", as well as an IAuditable "interface" entity that the Posts and Users entities "implement". It would be possible for an entity to implement multiple interface entities, such as in the case of Posts. Here's a diagram of how this can be done.

enter image description here

Note that the fields containing references to the interface entities would be unique.

Advantages:

  • You can enforce referential integrity to the interface entity that can be implemented by one of several entities.
  • If there are fields common to all implementing entities, they can be abstracted out to the interface entity.

Disadvantages:

  • It is possible for the id of an interface entity to be present in 0, or more than one, implementing entities.
  • The service layer would be required to determine which entity the notification or audit action applies to before knowing which table to query.