I tried to search here thoroughly but did not find any answers.
I have a PostgreSQL database, which has two main tables:
- documents
- users
These two tables have different relations. A user can:
- like
- bookmark to read later
- save
… a document.
The question is how should I save these relations?
In my experience with MySQL, the obvious way was to create tables for these many to many relations, containing user_id
and document_id
.
But as we are using PostgreSQL and it has the amazing JSON support, we are thinking maybe the better approach is to have a user_document
table, which contains user_id
, document_id
and a JSON column containing all the relations.
The JSON would be something like this:
{
'follow' : {'date' : 1523517140, 'doesFollow' : 't'},
'bookmark' : {'date' : null, 'doesBookmark' : 'f'},
....
}
I have little to zero experience with PostgreSQL and I don't know about the performance of querying on JSONB columns. And I don't know if this approach makes sense at all in PostgreSQL. But it seems OK and if there is nothing wrong with it, maybe it is preferable to the first, normal, approach.
Best Answer
Document types like
json
orjsonb
(orxml
orhstore
) are convenient to store, well, documents. Ideal for data with varying keys and rarely updated and not-too-complex filter criteria in queries. Manipulating big documents a lot inside the database would be an anti-pattern.Structured data that is often written in small increments, and probably searched a lot (like in your case most likely) are much more efficient with a normalized design - regarding storage, performance, concurrent write access and data integrity. Hence the advice in the manual:
Implement your n:m relationship with one ore more junction tables. You can enforce data integrity with constraints (PK, FK, UNIQUE, CHECK, ..), none of which is easily possible with document types. I would lean towards a single table unless you have substantially differing requirements for "like", "bookmark" etc.
Example:
If you are unsure about the typical layout, here are the basics:
Assuming that each of your relation types can only be used once per user/doc combination. For just a hand full of values (3 in your case) I would use a 1-byte
"char"
column as PK of a lookup table to optimize storage and performance in tables and indexes.You can always export aggregated data as JSON documents. Even have a
VIEW
orMATERIALIZED VIEW
to read like a table. But don't manage relationships inside a single big JSON document.