Efficient Database schema for multi-lingual content

application-design database-design schema

Assume I have a table for blog posts, posts, with the following schema:

id
author
title
content

Each post can belong to some categories, stored in a categories table:

id
name
description

So a blog post can belong to multiple categories.

Obviously title and content in posts, and name and description in categories can be translated to different languages.

So, given the existing db schema design, what would be the most efficient and reasonable way to extend it to support a multi-lingual application?

Best Answer

The least work would be to put a language code in each table. The problem then is there is no way to show that, say, post 612 (English) is a translation of post 943 (Spanish).

To achieve that level of cross-reference it will be necessary to separate the post's existence from its content. There will be these tables:

Language
  Language_id
  Name

Post
  Post_id

Category
  Category_id

PostCategory
  Post_id
  Category_id

CategoryLanguage
  Category_id
  Language_id
  Name
  Description
  ...

PostLanguage
  Post_id
  Language_id
  Title
  Content
  ...
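
As a rough illustration, the tables above could be created as follows. This is only a sketch using SQLite through Python's sqlite3 module; the column types, the NOT NULL and UNIQUE constraints, and the composite primary keys are my assumptions, since the outline only names the columns. The "..." columns are left out because they are not specified.

```python
import sqlite3

# Assumed DDL: types, NOT NULL / UNIQUE, and key choices are illustrative only.
SCHEMA = """
CREATE TABLE Language (
    Language_id INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL UNIQUE
);
CREATE TABLE Post (
    Post_id INTEGER PRIMARY KEY   -- language-independent columns would live here
);
CREATE TABLE Category (
    Category_id INTEGER PRIMARY KEY
);
CREATE TABLE PostCategory (
    Post_id     INTEGER NOT NULL REFERENCES Post(Post_id),
    Category_id INTEGER NOT NULL REFERENCES Category(Category_id),
    PRIMARY KEY (Post_id, Category_id)
);
CREATE TABLE CategoryLanguage (
    Category_id INTEGER NOT NULL REFERENCES Category(Category_id),
    Language_id INTEGER NOT NULL REFERENCES Language(Language_id),
    Name        TEXT NOT NULL,
    Description TEXT,
    PRIMARY KEY (Category_id, Language_id)   -- one translation per language
);
CREATE TABLE PostLanguage (
    Post_id     INTEGER NOT NULL REFERENCES Post(Post_id),
    Language_id INTEGER NOT NULL REFERENCES Language(Language_id),
    Title       TEXT NOT NULL,
    Content     TEXT NOT NULL,
    PRIMARY KEY (Post_id, Language_id)       -- one translation per language
);
"""


def create_schema(conn):
    """Create the six tables on the given connection."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)

# One post, two translations: both rows share Post_id 612.
conn.executemany("INSERT INTO Language VALUES (?, ?)", [(1, "English"), (2, "Spanish")])
conn.execute("INSERT INTO Post (Post_id) VALUES (612)")
conn.executemany(
    "INSERT INTO PostLanguage VALUES (?, ?, ?, ?)",
    [(612, 1, "Hello", "Hello, world"), (612, 2, "Hola", "Hola, mundo")],
)
```

With this layout the English and Spanish texts are tied together by the shared Post_id, which is the cross-reference that a plain language-code column cannot express.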

Ideally we would want both a post and its categories translated at the same time. It is difficult to enforce the existence of both simultaneously using declarative constraints. If this is a requirement, it would be simpler to push enforcement into the application.
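
One way an application-side check might look, as a sketch only: before publishing a translation of a post, the application asks which of the post's categories still lack a translation in that language. The function name missing_category_translations and the sample data are hypothetical, and the tables are cut down to the columns the check needs.

```python
import sqlite3


def missing_category_translations(conn, post_id, language_id):
    """Return the ids of the post's categories that have no row in
    CategoryLanguage for the given language (hypothetical helper)."""
    rows = conn.execute(
        """
        SELECT pc.Category_id
        FROM PostCategory AS pc
        LEFT JOIN CategoryLanguage AS cl
               ON cl.Category_id = pc.Category_id
              AND cl.Language_id = ?
        WHERE pc.Post_id = ?
          AND cl.Category_id IS NULL      -- no matching translation found
        ORDER BY pc.Category_id
        """,
        (language_id, post_id),
    ).fetchall()
    return [row[0] for row in rows]


# Minimal tables and made-up data, just enough to exercise the check.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PostCategory (
        Post_id INTEGER, Category_id INTEGER,
        PRIMARY KEY (Post_id, Category_id)
    );
    CREATE TABLE CategoryLanguage (
        Category_id INTEGER, Language_id INTEGER, Name TEXT,
        PRIMARY KEY (Category_id, Language_id)
    );
    """
)
conn.executemany("INSERT INTO PostCategory VALUES (?, ?)", [(612, 10), (612, 11)])
conn.executemany(
    "INSERT INTO CategoryLanguage VALUES (?, ?, ?)",
    [(10, 1, "News"), (11, 1, "Tech"), (10, 2, "Noticias")],  # 11 has no language 2
)

untranslated = missing_category_translations(conn, 612, 2)
```

The application could then refuse to save the translated post, or flag it as incomplete, whenever the returned list is not empty.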

When displaying a post, the corresponding categories should be those in the same language. To this end, each user could be allowed to choose a language. There will need to be rules to determine what happens if an item has not been translated to the user's preferred language. Maybe a system default language is defined?
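
If a system default language is chosen, the fallback rule could be handled in the lookup itself. The sketch below is one possible approach, not the only one: fetch_post and DEFAULT_LANGUAGE_ID are hypothetical names, and the table is reduced to PostLanguage with assumed column types. It returns the row in the user's preferred language when one exists, otherwise the row in the default language, otherwise nothing.

```python
import sqlite3

DEFAULT_LANGUAGE_ID = 1  # hypothetical system default language


def fetch_post(conn, post_id, preferred_language_id,
               default_language_id=DEFAULT_LANGUAGE_ID):
    """Return (Language_id, Title, Content) in the preferred language,
    falling back to the default language, or None if neither exists."""
    return conn.execute(
        """
        SELECT Language_id, Title, Content
        FROM PostLanguage
        WHERE Post_id = ?
          AND Language_id IN (?, ?)
        -- the preferred language sorts first, the default second
        ORDER BY CASE WHEN Language_id = ? THEN 0 ELSE 1 END
        LIMIT 1
        """,
        (post_id, preferred_language_id, default_language_id,
         preferred_language_id),
    ).fetchone()


# Made-up data: post 612 is translated, post 700 exists in the default only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE PostLanguage (
        Post_id INTEGER, Language_id INTEGER, Title TEXT, Content TEXT,
        PRIMARY KEY (Post_id, Language_id)
    )
    """
)
conn.executemany(
    "INSERT INTO PostLanguage VALUES (?, ?, ?, ?)",
    [
        (612, 1, "Hello", "Hello, world"),
        (612, 2, "Hola", "Hola, mundo"),
        (700, 1, "English only", "No translation yet"),
    ],
)

translated = fetch_post(conn, 612, preferred_language_id=2)
fallback = fetch_post(conn, 700, preferred_language_id=2)
```

The same pattern would apply to CategoryLanguage, so a post and its categories fall back independently. Whether mixing languages on one page is acceptable is a product decision rather than a schema one.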