Postgresql – Insert values from a record variable into a subclass table

dynamic-sqlparameterplpgsqlpostgresql

My goal is to create a stored procedure that builds up a _tbl_out which is a subclass of a _tbl_in. I'm following the results of this question, but for this to work, the tables need to have the SAME schema. I want the output_table to have the same schema PLUS some extra computed columns.

Will I have to use a less elegant solution than the one linked? Or is there a solution where I can append columns onto the _tbl_in record before I insert it to the subclass _tbl_out?

Here is the code for the previous solution for reference:

CREATE OR REPLACE FUNCTION gesio(_tbl_in anyelement, _tbl_out regclass)
  RETURNS void AS
$func$
BEGIN

FOR _tbl_in IN EXECUTE
   format('SELECT * FROM %s', pg_typeof(_tbl_in))
LOOP
   -- do something with record

   EXECUTE format('INSERT INTO %s SELECT $1.*', _tbl_out)
   USING _tbl_in;
END LOOP;

END
$func$  LANGUAGE plpgsql;

Call (important!):

SELECT gesio(NULL::t, 't1');

t and t1 being the tables with identical schema.

Best Answer

Your goal is ...

a stored procedure that builds up a _tbl_out which is a subclass of a _tbl_in.

And by "subclass" you mean there are additional columns at the end of _tbl_out.

So the function shall take two table names and copy contents from the first to the second - plus one or more additional (computed?) columns. The only thing we know about table definitions: the second has additional dangling columns.

You only need a polymorphic parameter (ANYELEMENT) to pass in or return values of dynamic type - or at least work with a variable of the type in the function body. What you demonstrate does not return anything (RETURNS void) and also has no need for the input value or type. You only use the table name derived from the input parameter and the parameter itself as auxiliary variable - which goes away after simplifying the function.

Unless you have computations depending on the type or value of the polymorphic parameter, two regclass parameters and a single dynamic statement can do the job, and much faster, too. (Or even just two text parameters passing valid table names, but regclass is more reliable):

CREATE OR REPLACE FUNCTION pg_temp.gesio(_tbl_in regclass, _tbl_out regclass)
  RETURNS void AS
$func$
BEGIN
   EXECUTE format($$
      INSERT INTO %s       -- *no* target column list
      SELECT *
           , 'some value'  -- AS computed_column1  -- column alias only for documentation
          -- more ?
      FROM   %s$$$         -- ORDER BY ???
    , _tbl_out, _tbl_in);
END
$func$  LANGUAGE plpgsql;

I added a note to my old answer you referenced that was not clear enough:

Performance with big tables

A major factor for the resulting cost is the Write Ahead Log (WAL) that has to be maintained for this form. There are ways to avoid that completely and make it substantially faster: Write to a completely new table within the transaction, so it's invisible to the rest of the world until committed. Details:

Aside: It's a function, not a stored procedure. Postgres does not have stored procedures, strictly speaking. (The most notable difference: functions always run inside a single transaction scope.)