Postgresql – Function that returns table with an additional column

datatypesfunctionspostgresqlset-returning-functionsview

I want to write a PostgreSQL function that returns a table, plus an additional column. Is there a way to do this without manually specifying RETURNS TABLE (col1 type, col2 type, ...)?

For example, consider the following function:

CREATE FUNCTION get_users_with_most_videos_since_time(ts TIMESTAMPTZ)
  RETURNS SETOF "user" AS $$
    SELECT
      u.*,
      count(v.id) AS vids_since

    FROM "user" AS u
    INNER JOIN "video" AS v ON v.creator_id = u.id
    WHERE v.created_at > ts

    GROUP BY u.id
    ORDER BY vids_since DESC;
$$ LANGUAGE SQL;

This fails with the error:

ERROR:  return type mismatch in function declared to return "user"
DETAIL:  Final statement returns too many columns.

Fair enough, we're including that vids_since column, which doesn't exist in the "user" table.

So to fix this I'd want to change it to something like:

CREATE FUNCTION get_users_with_most_videos_since_time(ts TIMESTAMPTZ)
  RETURNS 
    TABLE (<<< all columns from table "user" >>>, vids_since BIGINT) 
  AS $$
    ...

Is there a way to do this without having to re-copy the entire schema of the "user" table here?

Best Answer

AIUI, your wish is to shorten the RETURNS clause of the function. Not sure if you want to establish a dependency on the row type of the table at the same time, but that would make sense here, too.

The form RETURNS SETOFrettype relies on the used type to be stored in the system catalogs. The manual:

The return type can be a base, composite, or domain type, or can reference the type of a table column.

Therefore, just register the row type you desire in the system (once). You can explicitly create a composite type with CREATE TYPE. Or you can do it implicitly by creating another table, view or materialized view. Even just a temporary view or table for a temporary function (dying at the end of the session). Like @Abelisto commented:

CREATE VIEW user_plus AS 
SELECT *, null::bigint AS vids_since
FROM   "user"
WHERE  false;

However, SELECT * is resolved to the list of columns at creation time of the view ("early binding"). If you later add a column to the underlying table, the view and its row type are not updated and you get another type mismatch when executing the function. Trying to delete a column form the underlying table will complain about the column in the view that depends on it, though. And you have to change the view - and the function.

Your function can be simplified and faster, btw:

CREATE FUNCTION get_users_with_most_videos_since_time(_ts timestamptz)
  RETURNS SETOF user_plus AS
$func$
   SELECT *   -- effectively the same as: u.*, v.vids_since
   FROM   "user" AS u
   JOIN  (    -- aggregate *before* you join
      SELECT creator_id AS id, count(*) AS vids_since  -- note the column alias
      FROM   video v
      WHERE  created_at > _ts
      GROUP  BY 1
      ) v USING (id)  -- USING only retains one id column
   ORDER  BY v.vids_since DESC;
$func$  LANGUAGE SQL;

It actually makes sense to have this function. You want to filter by a column that's not in the result. Not possible with a plain VIEW.

Aside: I would discourage using reserved words like user as identifiers. That's a loaded footgun.