Postgresql – How to access to other INSERTed/UPDATEd rows from trigger in PostgreSQL

postgresqlpostgresql-9.6trigger

Solution in MS SQL

MS SQL trigger functions have deleted and inserted system tables in which all rows affected by operation are stored. You can count updated rows:

set @updatedCount = (select count(*) from deleted)

or find out minimum value:

set @updatedMinimumCol1 = (select min(col1) from deleted)

Problem with PostgreSQL

For FOR EACH ROW triggers I can use OLD and NEW system records, but they store only 1 row for each call of trigger. Calls of trigger are separated, so if user updates 10 rows, trigger will be called 10 times, but each time I can know only about 1 current row, not about all 10 rows.

For FOR EACH STATEMENT I do not know any mechanism of access to updated rows at all. I use PostgreSQL v9.6, OLD TABLE and NEW TABLE were introduced in v10.

PostgreSQL does not allow the old and new tables to be referenced in statement-level triggers, i.e., the tables that contain all the old and/or new rows, which are referred to by the OLD TABLE and NEW TABLE clauses in the SQL standard.


Try with transaction_timestamp() additional column

I can add special column with DEFAULT transaction_timestamp() to the main table and then use it to distinguish just-now-updated rows from other, but it is not a solution since multiple INSERTs/UPDATEs can be in one transaction and they will have the same timestamp of transaction. Probably I could clear this timestamp column in trigger after each statement to avoid this problem, but how to do this if such clearing will emit update trigger again – will be infinite update trigger calling.

So, this try was failed.


Bad solution in PostgreSQL

The only way I know is that:

First, use FOR EACH ROW trigger to collect current statistics (min and count) like aggregate functions. I use temp table to store it between calls (this trigger gets called 1 time for each row). But we will not know which row is last (when time will come to use this statistics).

CREATE TEMP TABLE IF NOT EXISTS _stats (
  _current_min   int,
  _current_count int
) ON COMMIT DROP;

IF EXISTS(SELECT 1 FROM _stats LIMIT 1) THEN
  --Current row is not first, there is statistics for previous rows.
  UPDATE _stats
  SET _current_min   = (CASE WHEN NEW.col1 < _current_min THEN NEW.col1
                        ELSE _current_min END)
    , _current_count = _current_count + 1;
ELSE
  --There is no stats because current row is first for this INSERT/UPDATE
  INSERT INTO _stats (_current_min, _current_count)
  VALUES (NEW.col1, 1);
END IF;

Second, use FOR EACH STATEMENT trigger to use collected statistics. Do not forget to clear temp table (if user will run multiple INSERTs/UPDATEs in one transaction, old statistics will remain in temp table and corrupt all next calculations!).

For more complex tasks we can create temp tables inserted and deleted on the same way as _stats.


The workaround

In PostgreSQL we can use RETURNING clause for INSERT/UPDATE/DELETE to obtain new values of all rows affected by operation. Then we can manipulate with them, but each function with INSERTs/UPDATEs have to implement this technology ===> 1. additional code in functions with such INSERTs/UPDATEs – duplication of the RETURNING; 2. we can forget to implement such technology to new function; 3. data will be corrupted since required manipulations will not be called automatically (like triggers will).


The question

Maybe, you know a better way to access all rows affected by INSERT/UPDATE?

Best Answer

See the docs, you should be able to access the old and new records from a statement trigger:

CREATE TRIGGER some_table_update_trigger
  AFTER UPDATE ON some_table
  REFERENCING NEW TABLE AS newtab OLD TABLE AS oldtab
  FOR EACH STATEMENT
  EXECUTE PROCEDURE do_something_with_newtab_and_oldtab();