PostgreSQL Database Design – Valid Case for Entity Attribute Values?

database-designpostgresql

Well currently we are not using eav, however our database seems a bit finicky and one suggestion was something along the lines of entity attribute values.

What we have is a table which acts as "translations". It has multiple rows with each line describing a translation. With multiple column, each column being a language.

CREATE TABLE table_name (
    id serial,
    organisation int,
    key text,
    en_gb text,
    nl_nl text,
   ....
);

This works fine (omitted constraints). However one feature is that users need to supply their own translations for data they inserted. In whatever language they wish. (Given the language needs to be formally described in iso 639-1)

This creates the weird situation where upon inserting not only a new row might be added. But the database schema also might be modified: a check is performed if the language column exists, if it is not it is added.

Which also adds a lot of columns (with nulls) for all fields – a minor problem though.

One solution would be to instead of having a column for each language use a json "translations" so that not each field/row needs to have a value for everything. This was dismissed on grounds that the far majority of the time we modify a single field, in an already existing column: apart from 6 or so languages where we operate we do not expect many more to be inserted.

Another solution was to use EAV, with instead of using a single row for all translations from a single field there is a new row for each language/translation of a certain field.

CREATE TABLE table_name (
    id serial,
    organisation int,
    key text,
    language text,
    translation text,
   ....
);

This seems a good case for entity attribute values right? Apart from that the database now is quite convoluted and isn't easily visualized as a neat table. So is it?

The ability to visualize as a neat table I personally like since insertion happens most often by providing a csv, so the visualization mirrors that.


For a given entity and language, could there be more than one translation provided? If so, how would they be distinguished from each other?

The user just inserts translations for organisations. A translation would be:

{
    organisation: 1, 
    key: 'orgname', 
    'en-us': 'somename', 
    'nl-nl': 'dutchname',
    'fr-fr': 'french-organisation-name'
}

Uniqueness is given by (organisation, key) (or when transformed to multiple lines (organisation, key, language)). When multiple users edit, given they have the right to do so, they translate "for everyone", there is no merit of who did the translation, or for whom the translation is.

Best Answer

I think that your proposed approach with a table row per language is fine, but I would definitely have language be a foreign key to a lookup table of languages.

Otherwise you'll end up with languages like English, en, EN or sprfz that people enter on a whim, and you will get complaints why the program does not display the proper English version of the text. Prefill that lookup table with entries for all conceivable languages.