Mysql – How to store user’s language knowledge (150+ languages) in thesql DB

database-designMySQL

Users in my website must be able to select languages they know (English, German, etc), so I have a table languages with 150 languages (id, name) and table users (id and a lot of other fields).

Website's visitors must be able to search users by their language knowledge. For example, they might be looking for someone good at English, German and French.

I'm unable to solve this. If I make a table language_knowledge with fields id, language_id and user_id, then I will be able to find users who know one specific language (lets say English), but I need to find users who know, for example, BOTH English and German.

How do I do this?

Best Answer

This is a classic situation where you use a joining table (a fiddle based on this can be found here. It was done much later than the original answer - better candidates for PRIMARY KEY of the language table!.

CREATE TABLE user
(
  user_id INT,
  user_name VARCHAR(25),
  PRIMARY KEY (user_id)
);

CREATE TABLE language
(
  iso_code CHAR(2),  -- or whatever you want the PRIMARY KEY to be.
  language_name VARCHAR(30),
  PRIMARY KEY (iso_code)
);

CREATE TABLE user_language
(
  ul_user_id INT,
  ul_iso_code CHAR(2),
  PRIMARY KEY (ul_user_id, ul_iso_code),
  FOREIGN KEY (ul_user_id) REFERENCES user (user_id),
  FOREIGN KEY (ul_iso_code) REFERENCES language (iso_code)
);  

I forgot to mention that the FOREIGN KEYs must point to a PRIMARY KEY or a UNIQUE KEY (and not some other arbitrary field) in the referenced table.

And for completeness, you should put a PRIMARY KEY on your two joining fields. This will help query response time and avoid multiple entries of the same data. Whatever your app does, always try to enforce RI (Referential Integrity) in the database.

This query shows how to get the user id and name for anyone who knows both English and German.

SELECT ul_user_id
    , user_name
FROM user_language ul 
    INNER JOIN language l ON ul.ul_iso_code = l.iso_code 
    INNER JOIN user u ON u.user_id = ul.ul_user_id
WHERE l.language_name IN (
    'English'
    , 'German'
    )
GROUP BY ul_user_id, user_name 
HAVING COUNT(*) = 2;

The HAVING COUNT(*) = 2 clause limits the results to only people who know both two languages. If the IN (...) clause had 3 different languages, you'd need to modify the HAVING clause to reflect that, like: HAVING COUNT(*) = 3