Database Design – Best Practices for Using Foreign Keys

foreign key

For example, there are three tables: players, dungeons, and instances.

A player has many dungeons and each dungeon has many instances, like in a well-known game of World of Warcraft. Each player would have a dungeon_id as a foreign key and each dungeon would have an instance_id as a foreign key. That would be a sufficient minimalist approach. A player of each instance can be accessed through the dungeon table. This would result in 1 less column, but a bit longer query.

Another approach is to add player_id to instances table as a foreign key. Now players can queried directly from instances table. Shorter query, but one more column for a foreign key.

Which approach is better in the long run? In practice it seems like adding a foreign key everywhere where it is logically possible makes it easier. But then wouldn't it result in tighter coupling and make code harder to refactor?

Best Answer

Ideally Entities should not have foreign keys and should be considered masters. So player, dungeon and instance are masters having informations/attributes/fields related to their own features. You can have a separate table for Enitity-Relations. So you would have player_dungeon and dungeon_instance (both one to many as per the PS). Here if you want to know instances belonging to a player, you need to join these two tables.

Now comes the famous debate: normalisation vs denormalisation. The first approach is by the book and would work. But it is difficult to scale as the tables grow (inspite of having proper indexes). So you try to denormalise the above solution depending on the volume of queries which are happening on these tables.

A player of each instance can be accessed through the dungeon table.

As you have specified one query here, I am assuming that is a frequent query (bottleneck). If there are many queries of such kind you can have another relationship table player_instance. Now we have solved the problem of latency but given birth to a bigger problem:

Consistency: Now there are two ways to find the instances belonging to a player. One is directly by player_instance table, another by joining player_dungeon and dungeon_instance. We must always get same answer from these two approaches. To ensure that we now need to make our writes suffer. How?

  • On addition of each player-dungeon mapping, apart from entering to our player_dungeon table we also need to insert into player_instance table (for which we have to query dungeon_instance to see all those instances mapped to that particular dungeon).
  • On addition of a new dungeon_instance we need to map all the existing players to that instance which we mapped to that particular dungeon. Doesn't sound good, does it?

We are not done yet. All of these operations should be atomic (i.e. in a transaction) or else there will be inconsistency for a short while.

Finally we need to make a trade-off. Do you want to make your database reads slow or writes suffer? Do you need to make your system responsive/available or consistent? The decision is yours to make.

Please read CAP theorem for details.