Database design – table with multiple nullable foreign keys

database-designdenormalizationnormalization

Context:

I have 3 unrelated entities (tables) Entity1, Entity2 and Entity3
I have a table called Operation, which can have multiple targets.
A target for an operation can be Entity1, Entity2 or Entity3.

To model this structure, I first modeled it like this (Approach 1):

Table OperationTarget

FK to Operation (Not Nullable)
FK to Entity 1 (Nullable)
FK to Entity 2 (Nullable)
FK to Entity 3 (Nullable)
A constraint that enforces only one of them will be set, and others will be Null

Then, I didn't like the idea of creating a table where at each row, 2 of 3 columns will always be Null, so I created a different approach (Approach 2):

Table OperationTarget

FK to Operation (Not Nullable)

Table OperationTargetEntity1

FK to OperationTarget(Not Nullable)
FK to Entity 1 (Not Nullable)

Table OperationTargetEntity2

FK to OperationTarget(Not Nullable)
FK to Entity 2 (Not Nullable)

Table OperationTargetEntity3

FK to OperationTarget(Not Nullable)
FK to Entity 3 (Not Nullable)

This approach feels better in terms of normalization, but in practice, this will make my codebase more complex, and also my queries more complex, as I will be needing joins. I need the parent table there for other application requirements.

I am currently going with approach 2. Is approach 1 something I should be avoiding at all costs, or when justified by simplicity and performance, it would be OK to go with it?

I've been reading on the topic for hours, some people don't like using Nullable columns at all, some say it's OK when application requires them (when it is possible you don't know the value for a column for a specific row, for instance where a user optionally can provide their birth date), but not sure if approach 1 is justifiable here, or should be avoided.

Best Answer

I can't see the entire data model or exactly the context of "Operation" so I can't tell you if there's a better solution, but in general the second approach is preferable, any complexities it creates can be handled by creating a view that creates a simple presentation of the data without requiring someone to even ponder the joins that occur behind the scenes. Any insert/update/delete procedures only need to be programmed once, the additional complexity will be minimal.

Yes, more tables... BUT:

Those tables will be narrow and small.
You would probably build some sort of filtered index on those columns to facilitate searches, and they would look remarkably identical to the tables you've created. But in creating them you give more meaning to them than just breaking rules that exist for a good reason in order to save a few lines of code.
If in the future OperationTarget not has more "targets", much easier to add another table instead of more columns to the base table.

I'd also look harder at this to see if this is a good application for exclusive subtypes, as I said I don't have a full picture of your data model so it would be something you'd have to evaluate on your own.

Related Solutions

Design Tagging module in the database

What you describe in your second paragraph sounds like the Entity Attribute Value pattern.

This is a known pattern for dealing with the type of data you are talking about. When you need to be able to be extensible, drive the types from the data and allow you to start tracking new types as you grow. It allows you stay flexible and agile and be a little more forgiving and generic in your database design and schema.

This pattern works, even in relational databases, but it does take more work to get good performance, and can cause you headaches down the line depending on how many rows, how many inserts you are doing and how frequently you query it. This SO Question talks about some of the pitfalls nicely in the firs answer listed with 20 votes. I have seen this model work well, especially in environments where the new types and entities really do come in and we want to let that be more application driven or user driven but we were fighting with performance more often in that model than a traditional relational database model but we needed the flexibility and we made it work. Definitely look at that SO question and do some more research on EAV and whatever DBMS you are using to find examples from others who have tried the same.

Help with good RDBMS transactional schema design that mimicks a sports league

The basic idea is to separate hierarchy of an organizational chart from actuals teams.

The Level table looks like

LevelID | LevelType
-------------------
  1      League 
  2      Division
  3      Conference
  4      Team_Slot

Team_Slot is a "pigeon-hole" for an actual team to fill-in over time.

enter image description here

Few more notes:

Alternate Key (PositionID, LevelID) is propagated into TeamPosition in order to allow a check constraint LevelID = 4 -- actual teams can fill-in only Team_Slots.
Team can not change Team_Slot within one season.
For simplicity, the hierarchy is here modelled with the levelled-adjacency-list; not most efficient for SQL hierarchies, but good enough for an explanation.
For more efficient hierarchy models, see Celko's book or just google 'SQL hierarchies'.

Best Answer

Related Solutions

Design Tagging module in the database

Help with good RDBMS transactional schema design that mimicks a sports league

Related Question