Should I create a extra relation-table when I have two columns in one table referencing to one column in another table

best practicesdatabase-designnormalization

My problem is best explained by example:

Suppose I have a database about chess games and players.

Which one of the following structures should I choose

Option 1)

player (player_id,name... etc)
game (game_id,date... etc)
game_player (game_id,player_id,white)

where game_player.white is True/False depending if the player was playing black or white

Option 2)

game (game_id,black_player,white_player,date... etc)
player (player_id,name... etc)

I think Option 2) is the best because I know that every game has exactly 2 players and one of them is black and one is white. However, somehow the queries in this setup don't feel so natural to me. Even in simple queries I have to join to player table twice and need conditions like game.black_player = player.player_id

Option 1) feels good because it's immediately clear where each column is referencing. However, otherwise I don't like option 1) so much.

  • So which one is the best practice and why? Or is there even better Option 3)?
  • If I choose option 2) how should I name the black_player and white_player columns to be clear that they are referencing to the player table.

Hopefully my question makes sense.

I realize that this is a very basic question but I can't stop thinking about this and didn't find anything by searching.

Best Answer

I'd definitely choose option 2. A game of chess is between exactly two players.

Option 1 allows a game with 0, 1, 2, 3 or more players, while the data model in option 2 (assuming not null on the two player columns) means that you will never have these data inconsistencies. Just remember to also add the check constraint black_player != white_player.