MySQL Database Design – Designing a Relational Database with Interrelated Words

database-designMySQL

I'm developing a simple language game about antonyms, synonyms and 'similar speaking' words.

If you play in the 'Synonyms screen', the game will show you two synonyms and two unrelated words and you have to guess wich are the synonyms.
If you play in the 'Antonyms screen', the game will show you two antonyms and two unrelated words and you have to guess wich are the antonyms.
If you play in the 'Speaking screen', the game will show you two 'similar speaking' words, will speak one of each and you have to guess wich is.

My problem is that you can have more than one synonyms, like:

angry: enraged, furious, irated, offended, etc..
love: affection, passion, devotion, etc..

My first idea is something like this one:

Table Synonyms:

+-----------+-------------------+  
|   word    |     related       |  
+-----------+-------------------+  
| angry     |   enraged         |  
| angry     |   furious         |  
| angry     |   irated          |  
| angry     |   offended        |  
| enraged   |   furious         |  
| enraged   |   irated          |  
| enraged   |   offended        |   
| furious   |   irated          |  
| furious   |   offended        |   
| irated    |   offended        |   
| love      |   affection       |  
| love      |   passion         |  
| love      |   devotion        |  
| affection |   pasion          |  
| ....      |   ....            |  
+-----------+-------------------+  

And build the antonyms and speaking tables in the same way.

But it seems like a lot of work and it's difficult to add/remove one synonym.

Also, I don't know where I can put the information about how to speak every word. Maybe another table like:

+-----------+-------------------+  
|   word    |  pronuciation     |  
+-----------+-------------------+  
| angry     |   ˈaNGgrē         |  
| enraged   |   enˈrājd         |  
| ....      |   ....            |  
+-----------+-------------------+  

Best Answer

The associative entity idea is a good way to handle this, but it has a drawback that you might want to consider.

Something that hasn't been addressed in some other answers is how you're going to maintain all of this data. It's going to be a lot of work, particularly if you want to be able to get from one word to any of its synonyms (or antonyms) without knowing which of your words you're starting from. Consider your example of {angry, enraged, furious, irritated, offended}. That's a set of five synonyms. If you used associative entities you'd have these pairs: angry=enraged, angry=furious,... etc. You'd also have to do them the other way around: enraged=angry, furious=angry, and so forth.

Five synonyms isn't a lot. However, if you make an associative table for 5 synonyms you're going to have 10 entries. For 6 synonyms you'll have 15 and for 7 synonyms you'll have 21. You can see how this might get out of hand.

Alternative: Association via a root The good news is that there's another way to do this. If you pick one of your synonyms and make it the base or root word (say in your case "angry") then you can have one record for each of these words in one table (five synonyms = five records, include "angry" in your list of words) and then a second table containing just the roots (for example "angry")

The key (pun intended) is to have the list of words point at their base or root word. This would be via a foreign key.

So in your word table, you might have something like this:

  • angry = angry

  • enraged = angry

  • furious = angry

  • irritated = angry

  • offended = angry

Now, you might want to use a meaningless numerical key instead of "angry" in the second column in each of the above. That might make things easier for a variety of practical reasons. In that case you'd have a second table with just the base words with their meaningless numerical keys, for example:

  • 123, angry

So your word table looks more like this:

  • angry, 123

  • enraged, 123

  • furious, 123

  • irritated, 123

  • offended, 123

So how do you find synonyms for a word?

That's a simple process, but it's a two step* process:

  1. Look up the root key for the given word. Say you want to find synonyms for "furious". You look it up and find the root is number 123.

  2. Look for all the words that have the same root key as your word, these are your synonyms.

The advantage of doing it this way is that it saves you a ton of data maintenance. You only have to maintain one record for each word, plus one for each group of synonyms. That's a lot fewer than combinations of all synonyms, which is what you're looking at if you do this with an associative entity.

*Note: you don't actually have to do this in two steps. You can write a SQL which combines all of this into a single statement using a self join.