Database design – table with multiple nullable foreign keys

database-designdenormalizationnormalization

Context:

  • I have 3 unrelated entities (tables) Entity1, Entity2 and Entity3
  • I have a table called Operation, which can have multiple targets.
  • A target for an operation can be Entity1, Entity2 or Entity3.

To model this structure, I first modeled it like this (Approach 1):

Table OperationTarget

  • FK to Operation (Not Nullable)
  • FK to Entity 1 (Nullable)
  • FK to Entity 2 (Nullable)
  • FK to Entity 3 (Nullable)
  • A constraint that enforces only one of them will be set, and others will be Null

Then, I didn't like the idea of creating a table where at each row, 2 of 3 columns will always be Null, so I created a different approach (Approach 2):

Table OperationTarget

  • FK to Operation (Not Nullable)

Table OperationTargetEntity1

  • FK to OperationTarget(Not Nullable)
  • FK to Entity 1 (Not Nullable)

Table OperationTargetEntity2

  • FK to OperationTarget(Not Nullable)
  • FK to Entity 2 (Not Nullable)

Table OperationTargetEntity3

  • FK to OperationTarget(Not Nullable)
  • FK to Entity 3 (Not Nullable)

This approach feels better in terms of normalization, but in practice, this will make my codebase more complex, and also my queries more complex, as I will be needing joins. I need the parent table there for other application requirements.

I am currently going with approach 2. Is approach 1 something I should be avoiding at all costs, or when justified by simplicity and performance, it would be OK to go with it?

I've been reading on the topic for hours, some people don't like using Nullable columns at all, some say it's OK when application requires them (when it is possible you don't know the value for a column for a specific row, for instance where a user optionally can provide their birth date), but not sure if approach 1 is justifiable here, or should be avoided.

Best Answer

I can't see the entire data model or exactly the context of "Operation" so I can't tell you if there's a better solution, but in general the second approach is preferable, any complexities it creates can be handled by creating a view that creates a simple presentation of the data without requiring someone to even ponder the joins that occur behind the scenes. Any insert/update/delete procedures only need to be programmed once, the additional complexity will be minimal.

Yes, more tables... BUT:

  1. Those tables will be narrow and small.
  2. You would probably build some sort of filtered index on those columns to facilitate searches, and they would look remarkably identical to the tables you've created. But in creating them you give more meaning to them than just breaking rules that exist for a good reason in order to save a few lines of code.
  3. If in the future OperationTarget not has more "targets", much easier to add another table instead of more columns to the base table.

I'd also look harder at this to see if this is a good application for exclusive subtypes, as I said I don't have a full picture of your data model so it would be something you'd have to evaluate on your own.