PostgreSQL – Using a JSONB Column or Another Table to Save Relationships

Tags: database-design, json, many-to-many, postgresql

I tried to search here thoroughly but did not find any answers.

I have a PostgreSQL database, which has two main tables:

  • documents
  • users

These two tables have different relations. A user can:

  • like
  • bookmark to read later
  • save

… a document.

The question is how should I save these relations?

In my experience with MySQL, the obvious way was to create junction tables for these many-to-many relations, each containing a user_id and a document_id.

But since we are using PostgreSQL, which has excellent JSON support, we are wondering whether the better approach is a single user_document table containing user_id, document_id and a JSONB column holding all the relations.

The JSON would be something like this:

{
   "follow"   : {"date": 1523517140, "doesFollow": true},
   "bookmark" : {"date": null, "doesBookmark": false},
   ....
}
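For context, querying such a JSONB column might look something like this (a sketch only; the column name `relations` and the table layout are assumptions, not part of the question):

```sql
-- Hypothetical query against the JSONB approach:
-- find all documents a given user follows.
SELECT document_id
FROM   user_document
WHERE  user_id = 123
AND    relations @> '{"follow": {"doesFollow": true}}';

-- A GIN index can support the containment operator @>:
CREATE INDEX user_document_relations_idx
    ON user_document USING gin (relations);
```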

I have little to no experience with PostgreSQL, and I don't know how querying JSONB columns performs, or whether this approach makes sense in PostgreSQL at all. But it seems workable, and if there is nothing wrong with it, maybe it is preferable to the first, normalized approach.

Best Answer

Document types like json or jsonb (or xml or hstore) are convenient for storing, well, documents: ideal for data with varying keys that is rarely updated and filtered with not-too-complex criteria in queries. Heavily manipulating big documents inside the database is an anti-pattern.

Structured data that is written often in small increments and probably searched a lot (most likely your case) is much more efficient with a normalized design - regarding storage, performance, concurrent write access and data integrity. Hence the advice in the manual:

JSON data is subject to the same concurrency-control considerations as any other data type when stored in a table. Although storing large documents is practicable, keep in mind that any update acquires a row-level lock on the whole row. Consider limiting JSON documents to a manageable size in order to decrease lock contention among updating transactions. Ideally, JSON documents should each represent an atomic datum that business rules dictate cannot reasonably be further subdivided into smaller datums that could be modified independently.

Implement your n:m relationship with one or more junction tables. You can enforce data integrity with constraints (PK, FK, UNIQUE, CHECK, ..), none of which is easily possible with document types. I would lean towards a single table unless you have substantially differing requirements for "like", "bookmark", etc.

Example:

If you are unsure about the typical layout, here are the basics:

Assuming that each of your relation types can only be used once per user/document combination. For just a handful of values (3 in your case), I would use a 1-byte "char" column as the PK of a lookup table to optimize storage and performance in tables and indexes.

CREATE TABLE reltype (
   reltype "char" PRIMARY KEY
 , relation_type text UNIQUE NOT NULL
);

INSERT INTO reltype(reltype, relation_type) VALUES
   ('l', 'like')
 , ('b', 'bookmark')
 , ('s', 'save');

CREATE TABLE user_doc (
   user_doc_id int PRIMARY KEY GENERATED ALWAYS AS IDENTITY
 , user_id     int NOT NULL REFERENCES users     ON UPDATE CASCADE ON DELETE CASCADE
 , document_id int NOT NULL REFERENCES documents ON UPDATE CASCADE ON DELETE CASCADE
 , reltype     "char" NOT NULL DEFAULT 'l' REFERENCES reltype
 , CONSTRAINT user_doc_uni UNIQUE (user_id, document_id, reltype)
);
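Typical usage could then look like this (a sketch; the literal ids and the documents.document_id primary key are assumptions):

```sql
-- Record that user 123 bookmarked document 456.
-- ON CONFLICT makes the insert idempotent, relying on the
-- UNIQUE constraint on (user_id, document_id, reltype).
INSERT INTO user_doc (user_id, document_id, reltype)
VALUES (123, 456, 'b')
ON CONFLICT (user_id, document_id, reltype) DO NOTHING;

-- All documents user 123 has liked:
SELECT d.*
FROM   user_doc  u
JOIN   documents d ON d.document_id = u.document_id
WHERE  u.user_id = 123
AND    u.reltype = 'l';
```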

You can always export aggregated data as JSON documents, or even have a VIEW or MATERIALIZED VIEW to read it like a table. But don't manage relationships inside a single big JSON document.
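Such an export could be sketched like this, assuming the tables from the example above; it produces a JSON object per user/document pair, similar in shape to the one proposed in the question:

```sql
-- Aggregate the junction-table rows into one JSONB object
-- per (user_id, document_id), keyed by relation type.
SELECT u.user_id
     , u.document_id
     , jsonb_object_agg(r.relation_type, true) AS relations
FROM   user_doc u
JOIN   reltype  r USING (reltype)
GROUP  BY u.user_id, u.document_id;
```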