JSONB may be easy to read, but it's complicated and inefficient to write into.
See for example this question on what that looks like: PostgreSQL update and delete property from JSONB column. It's an order of magnitude harder than an update or delete with classic EAV tables.
Chances are that once you've written the parts that append, merge, and delete key/value pairs, the elegance or simplicity of that JSON-based design will seem much less obvious. As for the performance, I'm betting on it being much worse.
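To make the comparison concrete, here is a sketch of the EAV side (the entity_attributes table and its columns are invented for illustration):
-- with a classic EAV layout, removing one property is a single targeted statement:
DELETE FROM entity_attributes
WHERE entity_id = 1 AND attr_key = 'color';
With JSONB on 9.4, the equivalent means rebuilding the whole document server- or client-side.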
At the storage level, any write to a single property inside a JSON structure requires a rewrite of the entire column (and in fact of the containing row), at the same cost as if every property had changed.
This is not optimal in terms of I/O volume or the pressure it puts on vacuum.
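As an illustration (jsonb_set only landed in core with 9.5, and the docs table here is hypothetical):
-- only the key "a" changes, but the entire jsonb datum is rewritten,
-- along with a new version of the containing row:
UPDATE docs
SET data = jsonb_set(data, '{a}', '42'::jsonb)
WHERE id = 1;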
This problem is mentioned in another Stack Overflow question, with some more links and references:
How to perform update operations on columns of type JSONB in Postgres 9.4
There's ongoing work to ease JSONB updates for the programmer. The jsonbx extension provides functions and operators that can help on 9.4; presumably these will be integrated into PostgreSQL core in future versions. But the large I/O cost of a small update inside a large JSON object will remain.
jsonbx was demonstrated at a recent PostgreSQL conference (YouTube links):
Update and Delete operations for jsonb (part 1 of 2)
Update and Delete operations for jsonb (part 2 of 2)
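For a taste of what those operations look like, here are the operators that eventually landed in core PostgreSQL 9.5; jsonbx offered equivalents on 9.4, though the exact names there may have differed:
SELECT '{"a": 1, "b": 2}'::jsonb || '{"b": 3}'::jsonb;  -- merge/overwrite keys: {"a": 1, "b": 3}
SELECT '{"a": 1, "b": 2}'::jsonb - 'b';                 -- delete a key: {"a": 1}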
Just a slight variation on Chris's answer:
SELECT a, translate(b, '[]', '{}')::text[] AS b, d
FROM json_to_record('{"a": 1, "b": ["hello", "There"], "c": "bar"}')
AS x(a int, b text, d text);
The idea is the same: massage the JSON array into a Postgres array - in this case, through an array literal. In addition to somewhat cleaner-looking code (much as I love regex, it usually doesn't help in that regard :), it seems slightly faster, too:
CREATE TABLE jsonb_test (
    id serial,
    data jsonb
);
INSERT INTO jsonb_test (id, data)
SELECT i, format('{"a": %s, "b": ["foo", "bar"], "c": "baz"}', i::text)::jsonb
FROM generate_series(1,10000) t(i);
SELECT a, string_to_array(regexp_replace(b, '\[*\"*\s*\]*','','g'),',') AS b, d
FROM jsonb_test AS j,
LATERAL json_to_record(j.data::json) AS r(a int, b text, d text);
-- versus
SELECT a, translate(b, '[]', '{}')::text[] AS b, d
FROM jsonb_test AS j,
LATERAL json_to_record(j.data::json) AS r(a int, b text, d text);
On this dataset and on my test box, the regex version shows an average execution time of 300 ms, while my version shows 210 ms.
Best Answer
I think this can be done using a custom aggregate and existing jsonb functionality:
The following aggregate will "merge" two jsonb values into one. It is essentially the same as a || b (where a and b are jsonb values), but as an aggregate over multiple rows.
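A minimal sketch of such an aggregate (the names jsonb_merge_state and jsonb_merge are mine; note that || on jsonb is a 9.5 operator, so on 9.4 you would need jsonbx or a hand-written merge function):
-- state function: fold one jsonb object into the accumulator
CREATE OR REPLACE FUNCTION jsonb_merge_state(acc jsonb, val jsonb)
RETURNS jsonb LANGUAGE sql IMMUTABLE AS $$
    SELECT acc || val;  -- concatenates objects; the right side overwrites duplicate keys
$$;

CREATE AGGREGATE jsonb_merge(jsonb) (
    SFUNC    = jsonb_merge_state,
    STYPE    = jsonb,
    INITCOND = '{}'
);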
With that aggregate it's possible to write a recursive function that removes keys with null values: an inner SELECT turns a jsonb value into a set of key/value pairs and filters out the null values, recursing into nested objects, and the resulting values are then aggregated back into a single jsonb object.
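A sketch of what that function could look like, assuming the jsonb_merge aggregate above (the name strip_nulls is hypothetical; jsonb_build_object shipped with 9.5, on 9.4 json_build_object(...)::jsonb would serve instead):
-- recursively drop keys whose value is JSON null
CREATE OR REPLACE FUNCTION strip_nulls(value jsonb)
RETURNS jsonb LANGUAGE plpgsql AS $$
BEGIN
    RETURN (
        SELECT jsonb_merge(jsonb_build_object(t.key,
                   CASE jsonb_typeof(t.value)
                       WHEN 'object' THEN strip_nulls(t.value)  -- recurse into nested objects
                       ELSE t.value
                   END))
        FROM jsonb_each(value) AS t(key, value)
        WHERE jsonb_typeof(t.value) <> 'null'  -- the filter on null values
    );
END;
$$;

SELECT strip_nulls('{"a": 1, "b": null, "c": {"d": null, "e": 2}}'::jsonb);
-- expected: {"a": 1, "c": {"e": 2}}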
The result can be cast back to json if needed.
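For example (again using the hypothetical strip_nulls):
SELECT strip_nulls('{"a": null, "b": 1}'::jsonb)::json;
-- expected: {"b": 1}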
I don't have a Postgres 9.4 installation around for testing, but I think I only used functions available in 9.4.