PostgreSQL – How to Remove Known Elements from JSON[] Array

arrayjsonpostgresqlupdate

I'm facing an issue regarding using the JSON data type in PostgreSQL. I try to achieve storing a Java model denormalized in the DB. The model features lists of complex objects. Thus, I decided to model those as JSON in native PostgreSQL arrays.

This is a stripped down snippet of my table creation statement:

CREATE TABLE test.persons
(
  id UUID,
  firstName TEXT,
  lastName TEXT,
  communicationData JSON[],
  CONSTRAINT pk_person PRIMARY KEY (id)
);

As you can see it is a person featuring a list of communication data objects in JSON. One of such objects might look like this:

{"value" : "03334/254147", "typeId" : "ea4e7d7e-7b87-4628-ba50-6a5f6e63dbf6"}

I can easily append such a JSON object to an array using PostgreSQL's array_append. However, I fail at removing a known value from the array. Consider f.e. this SQL statement:

UPDATE test.persons
SET communicationData = array_remove(
      communicationData, 
      '{"value" : "03334/254147", "typeId" : "ea4e7d7e-7b87-4628-ba50-6a5f6e63dbf6"}'::JSON
    )
WHERE id = 'f671eb6a-d603-11e3-bf6f-07ba007d953d';

This fails with ERROR: could not identify an equality operator for type json. Do you have a hint how I could remove a known value from the JSON array? It would also be possible to remove by position in the array, as I know that one also…

PostgreSQL version is 9.3.4.

Best Answer

jsonb in Postgres 9.4 or later

Consider the jsonb data type in Postgres 9.4 or later. The 'b' at the end stands for 'binary'. Among other things, there is an equality operator (=) for jsonb. Most people will want to switch.

Depesz blog about jsonb.

json

There is no = operator defined for the data type json, because there is no well defined method to establish equality for whole json values. But see below.

You could cast to text and then use the = operator. This is short, but only works if your text representation happens to match. Inherently unreliable, except for corner cases. See:

Or you can unnest the array and use the ->> operator to .. get JSON object field as text and compare individual fields.

Test table

2 rows: first one like in the question, second one with simple values.

CREATE TABLE tbl (
   tbl_id int PRIMARY KEY
 , jar    json[]
);

INSERT INTO t VALUES
   (1, '{"{\"value\" : \"03334/254146\", \"typeId\" : \"ea4e7d7e-7b87-4628-ba50-f5\"}"
        ,"{\"value\" : \"03334/254147\", \"typeId\" : \"ea4e7d7e-7b87-4628-ba50-f6\"}"
        ,"{\"value\" : \"03334/254148\", \"typeId\" : \"ea4e7d7e-7b87-4628-ba50-f7\"}"}')

 , (2, '{"{\"value\" : \"a\", \"typeId\" : \"x\"}"
        ,"{\"value\" : \"b\", \"typeId\" : \"y\"}"
        ,"{\"value\" : \"c\", \"typeId\" : \"z\"}"}');

Demos

Demo 1

You could use array_remove() with text representations (unreliable).

SELECT tbl_id
     , jar, array_length(jar, 1) AS jar_len
     , jar::text[] AS t, array_length(jar::text[], 1) AS t_len
     , array_remove(jar::text[], '{"value" : "03334/254147", "typeId" : "ea4e7d7e-7b87-4628-ba50-f6"}'::text) AS t_result
     , array_remove(jar::text[], '{"value" : "03334/254147", "typeId" : "ea4e7d7e-7b87-4628-ba50-f6"}'::text)::json[] AS j_result
FROM   tbl;
Demo 2

Unnest the array and test fields of individual elements.

SELECT tbl_id, array_agg(j) AS j_new
FROM   tbl, unnest(jar) AS j   -- LATERAL JOIN
WHERE  j->>'value' <> '03334/254146'
AND    j->>'typeId' <> 'ea4e7d7e-7b87-4628-ba50-6a5f6e63dbf5'
GROUP  BY 1;
Demo 3

Alternative test with row type.

SELECT tbl_id, array_agg(j) AS j_new
FROM   tbl, unnest(jar) AS j   -- LATERAL JOIN
WHERE  (j->>'value', j->>'typeId') NOT IN (
         ('03334/254146', 'ea4e7d7e-7b87-4628-ba50-6a5f6e63dbf5')
        ,('a', 'x')
       )
GROUP  BY 1;

UPDATE as requested

Finally, this is how you could implement your UPDATE:

UPDATE tbl t
SET    jar = j.jar
FROM   tbl t1
CROSS  JOIN LATERAL (
   SELECT ARRAY(
      SELECT j
      FROM   unnest(t1.jar) AS j  -- LATERAL JOIN
      WHERE  j->>'value'  <> 'a'
      AND    j->>'typeId' <> 'x'
      ) AS jar
   ) j
WHERE  t1.tbl_id = 2              -- only relevant rows
AND    t1.tbl_id = t.tbl_id;

db<>fiddle here

About the implicit LATERAL JOIN:

About unnesting arrays:

DB design

To simplify your situation consider an normalized schema: a separate table for the json values (instead of the array column), joined in a n:1 relationship to the main table.