Postgresql – Deconstruct json and insert/update it into a different table in an updatable view

postgresqltriggerview

I have a view that essentially presents a denormalized view onto my data.

It's roughly like having a blog table and a comments table, and the view then shows a blog post per row, with all the comments belonging to that post in an array in a json field.

The view uses the following code to achieve that:

  (SELECT array_to_json(array_agg(row_to_json(x.*))) AS array_to_json  FROM
     (SELECT foo, bar
      FROM table_b
      WHERE a.id = table_b.abc_id) x) AS baz

I've made the view updatable, which works for everything except this kind of column. I also have some typical many-to-many relationships in that data, and those are automatically updated when you edit them via the view. But I couldn't figure out how to do the same for this pseudo-json column.

Essentially, I need to pass this json to an update/insert statement in my INSTEAD OF trigger. I experimented with json_to_recordset, but I couldn't get that to do what I need.

Any idea how I could achieve this?

Best Answer

What you're trying to achieve is something quite uncommon. However, it is feasible.

Assumptions:

You have a certain parent/child table relationship. I call the two tables parent and child, and assume the following structure:

-- Everything in its own schema
CREATE SCHEMA json_parent_child ;
SET search_path = json_parent_child ;

CREATE TABLE parent
(
   parent_id INTEGER PRIMARY KEY, 
   some_payload TEXT
) ;

CREATE TABLE child
(
   parent_id integer NOT NULL REFERENCES parent(parent_id),
   child_nr integer NOT NULL default 0,
   some_more_payload TEXT,
   PRIMARY KEY(parent_id, child_nr)
) ;

NOTE: The primary key of the child table is NOT an arbitrary child_id field, but a (let's call it) natural key (parent_id, child_nr), that reads like '1st child of parent', '2nd child of parent', ... , 'nth child of parent'.

For the sake of completeness, I fill the tables with some values:

INSERT INTO parent (parent_id, some_payload)
VALUES
  (1, 'parent-1'),
  (2, 'parent-2'),
  (3, 'parent-3'),
  (4, 'parent-4') ;

INSERT INTO child (parent_id, child_nr, some_more_payload)
VALUES
  (1, 1, 'payload-1 of parent-1'),
  (1, 2, 'payload-2 of parent-1'),
  (1, 3, 'payload-3 of parent-1'),
  (1, 4, 'payload-4 of parent-1'),

  (3, 1, 'payload-1 of parent-3') ;

At this point, we create a VIEW which shows all the information from the parent table, and also that of the children, in an aggregate form: a JSON column containing an array of JSON objects:

CREATE OR REPLACE VIEW all_together AS
SELECT
    parent.*, 
    json_agg(row(child.some_more_payload) ORDER BY child.child_nr) 
         AS more_payload
FROM
    parent 
    LEFT JOIN child USING(parent_id)
GROUP BY
    parent.parent_id
ORDER BY
    parent.parent_id ;

At this point, this is the data shown by our view:

SELECT * FROM all_together ;

+------------+--------------+--------------------------------------+
|  parent_id | some_payload |            more_payload              |
+------------+--------------+--------------------------------------+
|          1 |   parent-1   | [{"f1":"payload-1 of parent-1"},     |
|            |              | {"f1":"payload-2 of parent-1"},      |
|            |              | {"f1":"payload-3 of parent-1"},      |
|            |              | {"f1":"payload-4 of parent-1"}]      |
+------------+--------------+--------------------------------------+
|          2 |   parent-2   |  [{"f1":null}]                       | 
+------------+--------------+--------------------------------------+
|          3 |   parent-3   |  [{"f1":"payload-1 of parent-3"}]    | 
+------------+--------------+--------------------------------------+ 
|          4 |   parent-4   |  [{"f1":null}]                       | 
+------------+--------------+--------------------------------------+ 

(The way that rows with no children are shown can be improved. This is first concept.)

We can now define a trigger function to handle the updates of this view:

CREATE FUNCTION update_all_together() RETURNS trigger AS
$BODY$
BEGIN
    -- UPDATE the base table 'parent'
    UPDATE 
        json_parent_child.parent
    SET 
        parent_id = new.parent_id,
        some_payload = new.some_payload 
    WHERE 
        parent_id = old.parent_id ;

    -- UPDATE (actually, DELETE and REINSERT) the children
    -- Can be optimised to really only update what has changed
    DELETE FROM
        json_parent_child.child
    WHERE
        parent_id = old.parent_id ;

    INSERT INTO
        json_parent_child.child
        (parent_id, child_nr, some_more_payload)
    SELECT 
        new.parent_id, row_number() over (), x.f1
    FROM 
        json_to_recordset(new.more_payload) AS x(f1 text) 
    WHERE
        x.f1 IS NOT NULL;

    RETURN new ;
END ;
$BODY$
LANGUAGE plpgsql ;

and link the trigger function to an INSTEAD trigger:

CREATE TRIGGER all_together_update_trigger
  INSTEAD OF UPDATE
  ON json_parent_child.all_together
  FOR EACH ROW
  EXECUTE PROCEDURE json_parent_child.update_all_together();

When you perform an UPDATE of the view:

UPDATE 
    all_together
SET
    some_payload = 'updated 1',
    more_payload = 
'[
  {"f1":"new_payload-1 of parent-1"}, 
  {"f1":"new_payload-2 of parent-1"}, 
  {"f1":"new_payload-3 of parent-1"}, 
  {"f1":"new_payload-4 of parent-1"}
]' 
WHERE
    parent_id = 1 ;

This is what you get after:

  SELECT * FROM all_together;

  +------------+--------------+--------------------------------------+  
  |  parent_id | some_payload |            more_payload              |  
  +------------+--------------+--------------------------------------+  
  |          1 |   updated 1  | [{"f1":"new_payload-1 of parent-1"}, |  
  |            |              | {"f1":"new_payload-2 of parent-1"},  |  
  |            |              | {"f1":"new_payload-3 of parent-1"},  |  
  |            |              | {"f1":"new_payload-4 of parent-1"}]  |  
  +------------+--------------+--------------------------------------+  
  |          2 |   parent-2   |  [{"f1":null}]                       |  
  +------------+--------------+--------------------------------------+  
  |          3 |   parent-3   |  [{"f1":"payload-1 of parent-3"}]    |  
  +------------+--------------+--------------------------------------+   
  |          4 |   parent-4   |  [{"f1":null}]                       |  
  +------------+--------------+--------------------------------------+   

The two little tricks, that basically follow the idea of the OP, with some small variations:

  1. Use json_agg(row(child.some_more_payload) ORDER BY child.child_nr) to aggregate all children into a JSON array (of JSON objects), keeping order. It is necessary to have this data type to allow the next function to give back the rows.

  2. At the trigger, get back to a recordset by using json_to_recordset(new.more_payload) AS x(f1 text), and INSERT this recordset into the child table.

NOTE: I've not checked the 'null' cases, but it just a matter of adding some extra conditions to the trigger, and/or having some strategic COALESCE and/or NULLIF in the appropriate places.