Database Design – Alternatives for Multiple Foreign Keys

database-theoryforeign keynormalization

Say there are entities called singulars, and entities called relationships.

It takes exactly two singulars to make up a relationships entity.

That pair of singulars can't be repeated elsewhere in relationships, in any order.

One way to model it could be this way:

+----------------+
|relationships   |         +----------+
+----------------+         |singulars |
|id              |         +----------+
|singular_id_1   <---------+id        |
|singular_id_2   <---+     |attribute1|
|pair_description|         |attribute2|
|pair_date       |         |          |
|                |         +----------+
+----------------+

With this pattern, it becomes necessary to check both foreign key fields in relationships for the existence of singulars, which could be on either side. The order doesn't matter, yet it is defined in the schema… So the queries end up with a number of AND/OR groups and cases.

Expanding on that approach could be to store two records for every pair, with the singular_id_[n] swapped on both sides. While that solves some querying complexities, it would introduce additional complexities to make it infeasible.

Using an intermediate table seems like one potential solution:

+----------------+       +-----------------------+
|relationships   |       |singulars_relationships|        +----------+
+----------------+       +-----------------------+        |singulars |
|id              <-------+relationship_id        |        +----------+
|pair_description|       |singular_id            +-------->id        |
|pair_date       |       |                       |        |attribute1|
|                |       +-----------------------+        |attribute2|
+----------------+                                        |          |
                                                          +----------+

So the records might end up something like this:

+----------------------------------+
|relationships                     |
+----------------------------------+
|id   pair_description   pair_date |
+----------------------------------+
|1    Fizz buzz blitz    2022-02-20|
|2    Blitz buzz fizz    2022-02-22|
+----------------------------------+

+----------------------------------+
|singulars_relationships           |
+----------------------------------+
|relationship_id   singular_id     |
+----------------------------------+
|1                 1               |
|1                 2               |
|2                 3               |
|2                 4               |
+----------------------------------+

+-----------------------------+
|singulars                    |
+-----------------------------+
|id   attribute1   attribute2 |
+-----------------------------+
|1    Fizz         Blitz      |
|2    Buzz         Foo        |
|3    Bar          World      |
|4    Blorg        Hello      |
+-----------------------------+

There, singulars_relationships is where the pairs are defined. If a singular_id exists in there, it is already in a pair. One problem that may arise with this pattern could be that three or more singular_id s could end up associated with a relationship_id, and the "exactly n" constraint would then be compromised.

Are there official terms for this type of scenario? And other theory and alternatives?

Best Answer

Are there official terms for this type of scenario?

Yes. This is a Symmetric Relation. And "relation" here has the same meaning as in "Relational Database". An RDBMS is a database management system designed around storing relations. However RDBMSs don't have a native way to store symmetric relations. You either have to store both tuples, eg (a,b) and (b,a) as separate rows, or you have to use some sort of convention to store only one tuple. A common approach is to use a check constraint on the FKs.

eg

check (singular_id_1 < singular_id_2)

Assuming the relation is anti-reflexive.