Update n:m Relation in View as Array (Operations) – PostgreSQL

concurrencylockingmany-to-manypostgresql

Environment

I came up with a view and trigger based solution to modify rows in multiple underlying tables in a Postgres 9.5.3 database. This works reasonable using Qt's model view programming. The next task I have to solve is mapping n:m relations which refer to the mentioned objects.

Tables

CREATE TABLE art (
  id bigserial NOT NULL PRIMARY KEY,
  name character varying(40),
  -- more columns, irrelevant to the problem
);

CREATE TABLE tag (
  id bigserial NOT NULL PRIMARY KEY,
  name character varying(40) NOT NULL UNIQUE,
  useable boolean NOT NULL DEFAULT false
);

CREATE TABLE art_tag (
  article_ref bigint NOT NULL,
  tag_ref bigint NOT NULL,
  CONSTRAINT art_tag_pk PRIMARY KEY (article_ref, tag_ref),
  CONSTRAINT art_tag_article_fk FOREIGN KEY (article_ref)
      REFERENCES art (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT art_tag_tag_fk FOREIGN KEY (tag_ref)
      REFERENCES tag (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE
);

The view is populated with every relevant column of the main table art (articles) and another column with all tag IDs as array – aggregated with array_agg(<column>) AS <alias>.

Whenever I edit a view row I have to handle the array which resolves to the surrogate key table.

Required Operation

I was looking for a kind of array subtraction which can be explained by the following examples.

{1,2,4} - {1,2,3} = {4}
{1,2,3} - {1,2,4} = {3}

Solution

This code could be used inside the triggers (remove the clutter beforehand) to calculate all elements which have to be deleted from or inserted into the surrogate key table.

DO
$do$
DECLARE
  n bigint[] := array[1,2,3];
  o bigint[] := array[1,2,4];
  i integer;
BEGIN
  FOREACH i IN ARRAY n
  LOOP
    IF NOT (SELECT i = ANY (o)) THEN
      RAISE NOTICE 'insert: %', i; -- execute insert query
    END IF;
  END LOOP;
  FOREACH i IN ARRAY o
  LOOP
    IF NOT (SELECT i = ANY (n)) THEN
      RAISE NOTICE 'delete: %', i; -- execute delete query
    END IF;
  END LOOP;
END
$do$

resulting in:

NOTICE:  insert: 3
NOTICE:  delete: 4

which perfectly fits the expected output. Replacing the NOTICE with a suitable query should suffice. As the trigger always handles exactly one view row the second referenced id, not contained in the array, is known to the trigger function. So following the example code I got two pairs

(1,3) and (1,4)

where 1 represents the article id. The first pair is the one to insert into the surrogate key table and the second one is the one to delete.

Question

Will that just work or is it even an appropriate solution to the task? Is there any other, maybe better way to achieve this? (Concurrency should be assured.)


Qt's model view programming works pretty simple if exactly one table is affected. Otherwise I have to code more stuff. The chosen solution suites me because I like to put as much as possible into the database.

Best Answer

I'll assume a single given art.id at a time.

Since there is no provision to explicitly lock not-yet-existing rows ("value locking") in Postgres, it's hard to defend against concurrent INSERT commands in the default isolation level READ COMMITTED. The more expensive isolation level SERIALIZABLE should solve this, I didn't take a closer look, I think I have a better idea.

A "brute force" approach in READ COMMITTED would be to lock the whole table art_tag in SHARE ROW EXCLUSIVE mode. The manual:

This mode protects a table against concurrent data changes, and is self-exclusive so that only one session can hold it at a time.

If you have lots of concurrent transactions writing to the table, performance will deteriorate quickly either way.

To only defend against concurrent UPDATE / DELETE commands in the default isolation level READ COMMITTED, lock all rows of the given article_ref in art_tag and wrap the whole operation into a single transaction.

Solution

There is a smarter alternative: take a FOR NO KEY UPDATE lock on the one row in the master table art. (FOR NO KEY UPDATE is a bit less restrictive than the common FOR UPDATE and good enough for our purpose.)

This is assuming that you have no other, possibly racing, concurrent write access to art_tag that ignores locks on art. It's your responsibility to make sure of it.

BEGIN;

SELECT * FROM art WHERE id = 1      -- we do not actually retrieve rows
FOR NO KEY UPDATE;                  -- only to lock the master row exclusively

DELETE FROM art_tag
WHERE  tag_ref <> ALL ('{1,2,3}'::bigint[]);

INSERT INTO art_tag (article_ref, tag_ref)
SELECT 1, unnest('{1,2,3}'::bigint[])
ON CONFLICT DO NOTHING;             -- requires Postgres 9.5 or later

COMMIT;                             -- locks released at end of transaction

To fine-tune behavior in case of conflicts consider the NOWAIT or SKIP LOCKED modifiers.

Related:

Function

You can wrap this into a function to simplify parameter passing and have an automatic transaction wrapper:

CREATE OR REPLACE FUNCTION f_set_tags(_article bigint, VARIADIC _tags bigint[])
  RETURNS void AS
$func$
BEGIN
   PERFORM * FROM art WHERE id = _article
   FOR NO KEY UPDATE;

   DELETE FROM art_tag
   WHERE  tag_ref <> ALL (_tags);

   INSERT INTO art_tag (article_ref, tag_ref)
   SELECT _article, unnest(_tags)
   ON CONFLICT DO NOTHING;
END
$func$ LANGUAGE plpgsql;

Call:

SELECT f_set_tags(1,1,2,3);  -- 1st param is article_ref, rest are tag_ref

Or, if you have an actual array to pass:

SELECT f_set_tags(1, VARIADIC '{1,2,3}');
SELECT f_set_tags(1, VARIADIC your_array_variable);

For small sets of tags (max. 100), a VARIADIC parameter should be particularly convenient, to pass a list of values for the array. You can still pass an actual ARRAY (with any number of elements). Details:

Could be a plain SQL function as well (switch back SELECT for PERFORM). I expect PL/pgSQL to be a bit faster for repeated calls in the same session (prepared statements).

Caution! Locks are only released at the end of a transaction, so it might be unwise to call this function in the context of a bigger outer transaction!