I want to embed certain INSERT
, UPDATE
or DELETE
statements into a function, to enhance re-usability. These functions have in common that the are manipulating a single row (if it exists).
An example is the following function:
create function my_table_delete(_identifier int) returns setof int as
$BODY$
begin
return query
delete from my_table
where identifier = _identifier
returning identifier;
end
$BODY$
language plpgsql volatile strict;
This function accepts an identifier, deletes the corresponding row and returns the same identifier in case the delete was successful.
I am quite unsure if I should implement this as a set-returning function (like I did above) or a scalar function.
My reasoning for preferring set-returning function is the following:
-
In case of a scalar function, we will have a
NULL
return value in case the row could not be found (while we will have 0 rows in case of the set-returning function). Having to deal withNULL
s is a pain. -
I'm working with sets (SQL is all about sets), so it makes sense to just keep doing that.
-
Also, I prefer returning the identifier (
1 row
: successful,0 rows
: not successful) instead of aboolean
(true
: successful,false
: not successful), as it would allow constructs such as:
with d(id) as
(
select * from my_table_delete(1234)
)
insert into log_table(id, action)
select d.id, 'DELETE'
from d;
If I were to return a boolean
, the statement above could become less elegant.
I'm interested whether those arguments make sense. I'm quite new to the embedding logic into the database, so I'm not sure if I am addressing this correctly.
What do you think?
On a slightly related note, I also wonder if I should define the function as following:
create function my_table_delete(_identifier int)
returns table (identifier int)
as ...
This gives the output row's column a name, which is even more convenient to use.
Best Answer
First, let's assert that manipulating a single row is positively guaranteed by a
UNIQUE
orPRIMARY KEY
constraint onmy_table.identifier
. Else it should be a table function to begin with.Next: Should you write a function for such a simple
DELETE
at all? Or use the plainDELETE
instead? Barely longer and less overhead. (Maybe your actual function is more sophisticated ...)Then, while your function is that simple, a plain SQL function might serve:
Also using
RETURNS TABLE
like you pondered: Note the table-qualified column name to avoid conflicts. All function parameters (including columns of aRETURNS TABLE
clause) are visible in SQL commands in the function body (in PL/pgSQL as well as in SQL functions). You need to avoid conflicts between column names and parameters.An SQL function has less overhead. But if you call the function many times in the same session, the PL/pgSQL variant may actually be faster since it manages SQL statements like prepared statements, meaning the query plan is saved and reused if some preconditions are met. Then again: much of the same effect is achieved with prepared statements. See:
As to your reasoning:
1. If you don't want to deal with
NULL
, then the case is settled. Make it a table function. But you may have to deal withNULL
after all: Various query constructs (have to) convert "no row" to NULL.2. sounds like an argument, but is just a logical fallacy. Does not carry weight.
3. If it's your personal preference, then the case is settled again. Can't argue with that. But a data-modifying CTE would work with the scalar function just as well:
In summary: go with
RETURNS TABLE
, but it hardly matters for the given case.