For example, there are three tables: players
, dungeons
, and instances
.
A player
has many dungeons
and each dungeon
has many instances
, like in a well-known game of World of Warcraft. Each player
would have a dungeon_id
as a foreign key and each dungeon
would have an instance_id
as a foreign key. That would be a sufficient minimalist approach. A player
of each instance
can be accessed through the dungeon
table. This would result in 1 less column, but a bit longer query.
Another approach is to add player_id
to instances
table as a foreign key. Now players can queried directly from instances
table. Shorter query, but one more column for a foreign key.
Which approach is better in the long run? In practice it seems like adding a foreign key everywhere where it is logically possible makes it easier. But then wouldn't it result in tighter coupling and make code harder to refactor?
Best Answer
Ideally Entities should not have foreign keys and should be considered masters. So
player
,dungeon
andinstance
are masters having informations/attributes/fields related to their own features. You can have a separate table for Enitity-Relations. So you would haveplayer_dungeon
anddungeon_instance
(both one to many as per the PS). Here if you want to knowinstances
belonging to aplayer
, you need to join these two tables.Now comes the famous debate: normalisation vs denormalisation. The first approach is by the book and would work. But it is difficult to scale as the tables grow (inspite of having proper indexes). So you try to denormalise the above solution depending on the volume of queries which are happening on these tables.
As you have specified one query here, I am assuming that is a frequent query (bottleneck). If there are many queries of such kind you can have another relationship table
player_instance
. Now we have solved the problem of latency but given birth to a bigger problem:Consistency: Now there are two ways to find the
instances
belonging to aplayer
. One is directly byplayer_instance
table, another by joiningplayer_dungeon
anddungeon_instance
. We must always get same answer from these two approaches. To ensure that we now need to make our writes suffer. How?player_dungeon
table we also need to insert intoplayer_instance
table (for which we have to querydungeon_instance
to see all those instances mapped to that particular dungeon).dungeon_instance
we need to map all the existing players to that instance which we mapped to that particular dungeon. Doesn't sound good, does it?We are not done yet. All of these operations should be atomic (i.e. in a transaction) or else there will be inconsistency for a short while.
Finally we need to make a trade-off. Do you want to make your database reads slow or writes suffer? Do you need to make your system responsive/available or consistent? The decision is yours to make.
Please read CAP theorem for details.