jsonb
in Postgres 9.4 or later
Consider the jsonb
data type in Postgres 9.4 or later. The 'b' at the end stands for 'binary'. Among other things, there is an equality operator (=
) for jsonb
. Most people will want to switch.
Depesz blog about jsonb.
json
There is no =
operator defined for the data type json
, because there is no well defined method to establish equality for whole json
values. But see below.
You could cast to text
and then use the =
operator. This is short, but only works if your text representation happens to match. Inherently unreliable, except for corner cases. See:
Or you can unnest
the array and use the ->>
operator to .. get JSON object field as text
and compare individual fields.
Test table
2 rows: first one like in the question, second one with simple values.
CREATE TABLE tbl (
tbl_id int PRIMARY KEY
, jar json[]
);
INSERT INTO t VALUES
(1, '{"{\"value\" : \"03334/254146\", \"typeId\" : \"ea4e7d7e-7b87-4628-ba50-f5\"}"
,"{\"value\" : \"03334/254147\", \"typeId\" : \"ea4e7d7e-7b87-4628-ba50-f6\"}"
,"{\"value\" : \"03334/254148\", \"typeId\" : \"ea4e7d7e-7b87-4628-ba50-f7\"}"}')
, (2, '{"{\"value\" : \"a\", \"typeId\" : \"x\"}"
,"{\"value\" : \"b\", \"typeId\" : \"y\"}"
,"{\"value\" : \"c\", \"typeId\" : \"z\"}"}');
Demos
Demo 1
You could use array_remove()
with text
representations (unreliable).
SELECT tbl_id
, jar, array_length(jar, 1) AS jar_len
, jar::text[] AS t, array_length(jar::text[], 1) AS t_len
, array_remove(jar::text[], '{"value" : "03334/254147", "typeId" : "ea4e7d7e-7b87-4628-ba50-f6"}'::text) AS t_result
, array_remove(jar::text[], '{"value" : "03334/254147", "typeId" : "ea4e7d7e-7b87-4628-ba50-f6"}'::text)::json[] AS j_result
FROM tbl;
Demo 2
Unnest the array and test fields of individual elements.
SELECT tbl_id, array_agg(j) AS j_new
FROM tbl, unnest(jar) AS j -- LATERAL JOIN
WHERE j->>'value' <> '03334/254146'
AND j->>'typeId' <> 'ea4e7d7e-7b87-4628-ba50-6a5f6e63dbf5'
GROUP BY 1;
Demo 3
Alternative test with row type.
SELECT tbl_id, array_agg(j) AS j_new
FROM tbl, unnest(jar) AS j -- LATERAL JOIN
WHERE (j->>'value', j->>'typeId') NOT IN (
('03334/254146', 'ea4e7d7e-7b87-4628-ba50-6a5f6e63dbf5')
,('a', 'x')
)
GROUP BY 1;
UPDATE
as requested
Finally, this is how you could implement your UPDATE
:
UPDATE tbl t
SET jar = j.jar
FROM tbl t1
CROSS JOIN LATERAL (
SELECT ARRAY(
SELECT j
FROM unnest(t1.jar) AS j -- LATERAL JOIN
WHERE j->>'value' <> 'a'
AND j->>'typeId' <> 'x'
) AS jar
) j
WHERE t1.tbl_id = 2 -- only relevant rows
AND t1.tbl_id = t.tbl_id;
db<>fiddle here
About the implicit LATERAL JOIN
:
About unnesting arrays:
DB design
To simplify your situation consider an normalized schema: a separate table for the json
values (instead of the array column), joined in a n:1 relationship to the main table.
If you know the desired column names (be it the same or different from the keys in the JSON structure, you can use json[b]_to_recordset()
:
SELECT * FROM jsonb_to_recordset('[
{ "col1": "a", "col2": 1, "col3": 1, "col4": "one"},
{ "col1": "b", "col2": 2, "col3": 2, "col4": "two"},
{ "col1": "c", "col2": 3, "col3": 3, "col4": "three"},
{ "col1": "d", "col2": 4, "col3": 4, "col4": "four"}
]'::jsonb) AS t (col1 text, col2 integer, col3 integer, col4 text);
col1 │ col2 │ col3 │ col4
──────┼──────┼──────┼───────
a │ 1 │ 1 │ one
b │ 2 │ 2 │ two
c │ 3 │ 3 │ three
d │ 4 │ 4 │ four
As the documentation tells us,
Note: In json_populate_record
, json_populate_recordset
, json_to_record
and json_to_recordset
, type coercion from the JSON is "best effort" and may not result in desired values for some types. JSON keys are matched to identical column names in the target row type. JSON fields that do not appear in the target row type will be omitted from the output, and target columns that do not match any JSON field will simply be NULL
.
If you already have a table to work with, json_populate_recordset()
is an even better solution:
CREATE TABLE inputtable (col1 text, col2 integer, col3 integer, col4 text);
SELECT * FROM jsonb_populate_recordset(NULL::yourtable, '[
{ "col1": "a", "col2": 1, "col3": 1, "col4": "one"},
{ "col1": "b", "col2": 2, "col3": 2, "col4": "two"},
{ "col1": "c", "col2": 3, "col3": 3, "col4": "three"},
{ "col1": "d", "col2": 4, "col3": 4, "col4": "four"}
]'::jsonb);
col1 │ col2 │ col3 │ col4
──────┼──────┼──────┼───────
a │ 1 │ 1 │ one
b │ 2 │ 2 │ two
c │ 3 │ 3 │ three
d │ 4 │ 4 │ four
Now updating the table itself may be done like this:
WITH source AS (SELECT * FROM jsonb_populate_recordset [...])
UPDATE yourtable
SET col1 = s.col1, col2 = s.col2
FROM source AS s
WHERE col3 = s.col3;
In case it seems slow, it might make sense to not use the CTE, but a subquery in the FROM
clause instead.
Best Answer
The data type of the column
people
isjson
, as is the result ofjson_array_elements(people)
. And there is no equality operator (=
) for the data typejson
. So you also cannot runGROUP BY
on it. More:jsonb
has an equality operator, so the "workaround" in your answer is to cast tojsonb
and use the equivalentjsonb_array_elements()
. The cast adds cost:Since Postgres 9.4 we also have
json_array_elements_text(json)
returning array elements astext
. Related:So:
It seems more convenient to get names as
text
instead ofjsonb
objects (double-quoted in text representation) and your "desired output" indicates you want / needtext
in the result to begin with.GROUP BY
ontext
data is also cheaper than onjsonb
, so this alternative "workaround" should be faster for two reasons. (Test withEXPLAIN (ANALYZE, TIMING OFF)
.)For the record, there was nothing wrong with your original answer. The comma (
,
) is just as "correct" asCROSS JOIN LATERAL
. Having been defined earlier in standard SQL does no make it inferior. See:Neither is it more portable to other RDBMS, and since
jsonb_array_elements()
orjson_array_elements_text()
aren't portable to other RDBMS to begin with, that's also irrelevant. The short query does not get any clearer withCROSS JOIN LATERAL
IMO, but the last bit is just my personal opinion.I used the more explicit table and column alias
p(name)
and the table-qualified referencep.name
to defend against possible duplicate names.name
is such a common word, it might also pop up as column name in the underlying tableband
, in which case it would silently resolve toband.name
. The simple formjson_array_elements_text(people) name
only attaches a table alias, the column name is stillvalue
, as returned from the function. Butname
resolves to it's single columnvalue
when used in theSELECT
list. It happens to work as expected. But a true column namename
(ifband.name
should exist) would bind first. While that won't bite in the given example, it can be a a loaded foot gun in other cases.Don't use the generic "name" as identifier to begin with. Maybe that was just for the simple test case.
If the column
people
can hold anything but a plain JSON array, either query would trigger an exception. If you cannot guarantee data integrity, you might want to defend withjson_typeof()
:Excludes violating rows from the query.
Related: