PostgreSQL – In Postgres, is there any real drawback to using a JSON field instead of multiple tables and one-to-many relationships?

database-design postgresql

In our current application we are basically going to provide something called "translations" for each field. So a table would originally be like:

CREATE TABLE organisation (
   id SERIAL PRIMARY KEY,
   name TEXT NOT NULL DEFAULT ''
);

But instead the name would now be language dependent.

The basic solution would be to no longer store the name field in the organisation table but instead in an "organisation_language" table:

CREATE TABLE organisation (
   id SERIAL PRIMARY KEY
);


CREATE TABLE organisation_language
(
  id SERIAL PRIMARY KEY,
  organisation INTEGER NOT NULL,
  field_name TEXT NOT NULL,
  field_language TEXT NOT NULL,
  field_translation TEXT NOT NULL DEFAULT '',
  
  FOREIGN KEY (organisation)
    REFERENCES public.organisation (id) MATCH SIMPLE
      ON UPDATE CASCADE
      ON DELETE CASCADE
);
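
To illustrate the normalized approach: with the schema above, reading back all translated fields for one organisation is a plain query against the language table (the organisation id 1 here is just an example):

```sql
-- All translations of all fields for organisation 1
SELECT field_name, field_language, field_translation
FROM organisation_language
WHERE organisation = 1;
```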

(And yes, I'm deliberately ignoring that I could add a lookup table for language keys like en-US and refer to that.)

However, now I can no longer "guarantee" that every field (like name) is defined. Of course the application code could enforce that, but that's another layer. It also adds quite a bit of complexity: it is no longer easy to see which fields even belong to a data model, or which translations exist for a given organisation.

If I look at the JSON I would actually "like" my backend to send to the frontend upon requesting "data from organisation 1" it would look like:

{
    "id": 1,
    "name": {
        "en-us": "hello world",
        "nl-nl": "hoi wereld"
    }
}
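
For comparison, producing that nested shape from the language table would need an aggregation roughly like the following sketch (using `json_object_agg`, and assuming the normalized schema from above):

```sql
-- Build { "en-us": ..., "nl-nl": ... } for the "name" field of organisation 1
SELECT o.id,
       json_object_agg(l.field_language, l.field_translation) AS name
FROM organisation o
JOIN organisation_language l
  ON l.organisation = o.id
 AND l.field_name = 'name'
WHERE o.id = 1
GROUP BY o.id;
```

With many translated fields this turns into one join (or one aggregation) per field, which is part of the complexity I'd like to avoid.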

On top of that, the main actions are insertion – but then insertion of all data at once (all translations in one go) – and retrieval of all language data, never a single language. (This is due to fallback "paths": if a field doesn't exist in en-gb, check en-us; that resolution happens in the backend.)

Modification happens so rarely that we are even considering just not supporting it, and instead going for copy-on-write and deactivating the old "organisation".
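
A copy-on-write "modification" would look something like this sketch (the `active` flag is hypothetical, it is not part of the schemas shown above):

```sql
-- Hypothetical copy-on-write update: retire the old row, insert a replacement
BEGIN;
UPDATE organisation SET active = false WHERE id = 1;
INSERT INTO organisation (name)
VALUES ('{"en-us": "hello earth", "nl-nl": "hoi aarde"}'::json);
COMMIT;
```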


This made me think: is it a bad idea to change the type of the "name" field to JSON, instead of having a language table (and to do this for all fields)? It would then look like:

CREATE TABLE organisation 
(
  id SERIAL PRIMARY KEY,
  name JSON NOT NULL DEFAULT '{}'::JSON
);

And the data inserted into the name field would be (the backend has to verify the schema):

{
    "en-us": "hello world",
    "nl-nl": "hoi wereld"
}
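
With the JSON column, insertion and retrieval collapse to single statements against one table, for example:

```sql
-- Insert an organisation with all its translations in one go
INSERT INTO organisation (name)
VALUES ('{"en-us": "hello world", "nl-nl": "hoi wereld"}'::json);

-- Retrieve all language data at once; the backend applies fallback paths
SELECT id, name FROM organisation WHERE id = 1;
```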

Is there an actual drawback to using JSON fields, as opposed to a language table? It has the advantage of easier lookup, insertion, and deletion. And since modification is not supposed to happen anyway, there's no drawback – correct?

Or will PostgreSQL choke on this later?

Best Answer

The first normal form requires columns to be single, atomic values. I'd argue that if you really treat these JSON objects as something to store and retrieve, they are atomic from a database point of view, and what you plan to do is fine.

If you want to use parts of the JSON in a WHERE condition, that could still be alright. If you want database constraints based on parts of the JSON, or you want to join on a JSON attribute, I'd advise against such a data model.
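
For example, filtering on part of the JSON is still straightforward with the `->>` operator (a sketch against the JSON-column schema from the question; note that the `jsonb` type, rather than `json`, would additionally allow a GIN index on the column):

```sql
-- Filter on one translation inside the JSON value
SELECT id
FROM organisation
WHERE name ->> 'en-us' = 'hello world';
```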