Postgresql – Listing function input and output details for stored functions in Postgres 11.5+

apifunctionspostgresql

I'm trying to build a query that extracts the input and output definitions for the functions in a schema. We've got multiple client types written in multiple languages, and plan to move many queries into stored functions. That way, each client can call a function without writing their own SQL code. I'd like to at least semi-automate the documentation, so I want to extract the inputs and outputs for each method. As an example, here's the sort of output I'm looking for from a sample method, listed below:

schema  function_name   lang    return_type run_as  owner_name  strict  returnset   volatile    comment is_input    item_number name    type    default
api push_log_count_since    plpgsql record  INVOKER user_bender f   t   v       t   1   since_dts   timestamptz 
api push_log_count_since    plpgsql record  INVOKER user_bender f   t   v       f   1   days_ago    int8    
api push_log_count_since    plpgsql record  INVOKER user_bender f   t   v       f   2   table_name  citext  
api push_log_count_since    plpgsql record  INVOKER user_bender f   t   v       f   3   server_name citext  
api push_log_count_since    plpgsql record  INVOKER user_bender f   t   v       f   4   push_count  int8    

The method has one input named since_dts and returns a table of results with four columns, as listed above and shown below:

CREATE OR REPLACE FUNCTION api.push_log_count_since (since_dts timestamptz)
    RETURNS TABLE(
        days_ago int8,
        table_name extensions.citext,
        server_name extensions.citext,
        push_count int8) 

AS $BODY$

DECLARE
-- Subtract two timestamptz values, get an interval, pull out just the days.
   days_ago bigint := date_part('day', now()::timestamptz - since_dts)::bigint;

BEGIN
RETURN QUERY
    SELECT days_ago,
           ib_table_name as table_name,
           data_file_info.server_name_ as server_name,
           count(*) as push_count

      FROM ascendco.push_log
      JOIN ascendco.data_file_info on (data_file_info.id = push_log.data_file_id)

     WHERE push_log.push_dts >= since_dts

  GROUP BY ib_table_name,
           data_file_info.server_name_

  ORDER BY ib_table_name,
           data_file_info.server_name_
           ;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE SECURITY DEFINER
  COST 100
  ROWS 1000;

ALTER FUNCTION api.push_log_count_since (timestamptz) 
    OWNER TO user_bender;

I'm not wedded to my big flat output, and will likely convert the data to a JSON:

{ 
   "schema":"api",
   "name":"push_log_count_since",
   "lang":"plpgsql",
   "return_type":"record",
   "run_as":"INVOKER",
   "owner_name":"user_bender",
   "strict":"false",
   "returnset":"true",
   "volatile":"v",
   "comment":"An COMMENT ON value set on the function.",
   "inputs":[ 
      { 
         "item_number":1,
         "name":"since_dts",
         "type":"timestamptz",
         "default":null
      }
   ],
   "outputs":[ 
      { 
         "item_number":1,
         "name":"days_ago",
         "type":"int8"
      },
      { 
         "item_number":2,
         "name":"table_name",
         "type":"citext"
      },
      { 
         "item_number":3,
         "name":"sever_name",
         "type":"citext"
      },
      { 
         "item_number":4,
         "name":"push_count",
         "type":"int8"
      }
   ]
}

It looks like the data I need is in the pg_proc system catalog, but I'm not sure how to extract it in a way that I can readily use. I've been bashing away at this, and have gotten a lot closer than when I originally posted. The information_schema.parameters view makes things easier. This script is a bit involved, and I'd guess that there are much simpler ways to achieve the same results. I'd be grateful for any help or suggestions on how to improve this.

-- The script below takes a function's name+oid and returns a row for each input or output parameter.
-- Information about the routine overall is included on each row, such as schema, name, comments.
-- For each parameter, you get the absolute ordinal position, positin in the input/output list,
-- data type and default value. It's a bit of a Frankenscript, but it's getting close to what I'm after.
-- If I can get this working, I'll likely wrap it in a function.
WITH function_name_parts AS (
-- The information_shcema catalogs concatenate function_name + oid.
-- This provides a unique name for overloaded functions. Parse these bits out.          
  select * from string_to_array('push_log_count_since_329059','_') as parts
 ),

function_identity AS (
      -- Take the array from above and built up the combined name, name, and oid.
       -- ! There must be a simpler way to do all of this. 

        select array_to_string(parts,'_') AS function_name_and_oid,
                 array_to_string(array_remove_element(parts,cardinality(parts)),'_') AS function_name,
                parts[cardinality(parts)]::oid as function_oid

    from function_name_parts
)

-- Grab information from parameters and routines. I haven't figured out how to get the OWNER out of pg_proc.
-- The pg_get_userbyid(pg_proc.proowner) function seems like it should work, but I've not sorted out how to get that working.
SELECT parameters.specific_catalog AS database_name,
       function_name,
       function_oid,
       ordinal_position,
       row_number() OVER (partition by parameter_mode order by ordinal_position) as input_output_position,
       parameter_mode,
       parameter_name,
       parameters.data_type,
       parameters.udt_name,
       parameter_default,
       external_language,
       security_type,
       get_function_owner_name(function_oid) as owner_name, -- Is there a smarter way to do this than a custom function?
       obj_description(function_oid) AS comment

      FROM function_identity,
              information_schema.routines
  LEFT JOIN information_schema.parameters  ON (information_schema.routines.specific_name = parameters.specific_name)

      WHERE parameters.specific_name = function_identity.function_name_and_oid

   ORDER BY ordinal_position

This returns output like the following:

database_name   function_name   function_oid    ordinal_position    input_output_position   parameter_mode  parameter_name  data_type   udt_name    parameter_default   external_language   security_type   owner_name  comment
squid   push_log_count_since    329059  1   1   IN  since_dts   timestamp with time zone    timestamptz NULL    PLPGSQL DEFINER user_bender NULL
squid   push_log_count_since    329059  2   1   OUT days_ago    bigint  int8    NULL    PLPGSQL DEFINER user_bender NULL
squid   push_log_count_since    329059  3   2   OUT table_name  USER-DEFINED    citext  NULL    PLPGSQL DEFINER user_bender NULL
squid   push_log_count_since    329059  4   3   OUT server_name USER-DEFINED    citext  NULL    PLPGSQL DEFINER user_bender NULL
squid   push_log_count_since    329059  5   4   OUT push_count  bigint  int8    NULL    PLPGSQL DEFINER user_bender NULL

The code for the array_remove_element function is here:

CREATE OR REPLACE FUNCTION tools.array_remove_element(anyarray, int)
  RETURNS anyarray LANGUAGE sql IMMUTABLE AS
'SELECT $1[:$2-1] || $1[$2+1:]';

COMMENT ON FUNCTION tools.array_remove_element(anyarray, int) IS '
From the ever-helpful Erwin Brandstetter, renamed for our schema and idiom.
https://dba.stackexchange.com/questions/94639/delete-array-element-by-index';

ALTER FUNCTION tools.array_remove_element (anyarray, int) 
    OWNER TO user_bender;

The code for the get_function_owner_name function is found below:

CREATE OR REPLACE FUNCTION tools.get_function_owner_name(function_oid oid)
  RETURNS text LANGUAGE sql AS
'select rolname::text from pg_authid where oid = (select pg_proc.proowner from pg_proc where oid = function_oid)';

ALTER FUNCTION tools.get_function_owner_name(function_oid oid) 
    OWNER TO user_bender;

Best Answer

This query will give you the argument types and result types as arrays for any function:

SELECT f.oid::regproc AS function_name,
       f.proargtypes::regtype[] AS argument_types,
       CASE WHEN f.proallargtypes IS NULL
            THEN ARRAY[f.prorettype::regtype]
            ELSE array_agg(args.type::regtype) FILTER (WHERE args.mode = 'o')
       END AS result_types
FROM pg_proc AS f
   LEFT JOIN LATERAL unnest(f.proallargtypes, f.proargmodes) AS args(type, mode)
      ON TRUE
GROUP BY f.oid, f.proallargtypes, f.prorettype, f.proargtypes;

Add a WHERE condition to restrict the output.