Postgresql – ny performance advantage to marking a SQL function as LEAKPROOF

functionsperformancepostgresql

PostgreSQL has an attribute LEAKRPOOF which you can use to declare a function,

LEAKPROOF indicates that the function has no side effects. It reveals no information about its arguments other than by its return value. For example, a function which throws an error message for some argument values but not others, or which includes the argument values in any error message, is not leakproof. This affects how the system executes queries against views created with the security_barrier option or tables with row level security enabled. The system will enforce conditions from security policies and security barrier views before any user-supplied conditions from the query itself that contain non-leakproof functions, in order to prevent the inadvertent exposure of data. Functions and operators marked as leakproof are assumed to be trustworthy, and may be executed before conditions from security policies and security barrier views.

The docs go on to say,

The query planner has more flexibility when dealing with functions that have no side effects. Such functions are referred to as LEAKPROOF, and include many simple, commonly used operators, such as many equality operators. The query planner can safely allow such functions to be evaluated at any point in the query execution process, since invoking them on rows invisible to the user will not leak any information about the unseen rows. Further, functions which do not take arguments or which are not passed any arguments from the security barrier view do not have to be marked as LEAKPROOF to be pushed down, as they never receive data from the view. In contrast, a function that might throw an error depending on the values received as arguments (such as one that throws an error in the event of overflow or division by zero) is not leak-proof, and could provide significant information about the unseen rows if applied before the security view's row filters.

Is the following LEAKRPOOF function

CREATE FUNCTION foo_leakproof(bar int)
RETURNS int AS $$
  SELECT bar;
$$
IMMUTABLE LEAKPROOF
LANGUAGE sql;

Ever planned differently or does it run with a different runtime profile than the LEAKPROOF lacking alternative,

CREATE FUNCTION foo_leaky(bar int)
RETURNS int AS $$
  SELECT bar;
$$
IMMUTABLE
LANGUAGE sql;

Best Answer

Seems like there is no difference in the most simple cases between thee two performance profiles,

CREATE TABLE t1 AS
  SELECT x::int FROM generate_series(1,1e6) AS gs(x);

CREATE VIEW v1 AS TABLE t1;

CREATE VIEW v2 WITH (security_barrier) 
  AS TABLE t1;

-- BASE TABLE
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM t1;
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM t1;
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM t1;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM t1;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM t1;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM t1;

-- VIEW
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM v1;
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM v1;
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM v1;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM v1;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM v1;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM v1;

-- VIEW with SECURITY_BARRIER
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM v2;
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM v2;
EXPLAIN ANALYZE SELECT foo_leaky(x) FROM v2;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM v2;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM v2;
EXPLAIN ANALYZE SELECT foo_leakproof(x) FROM v2;

These were all 100-110ms. No difference in performance.

However changing the functions to use

CREATE FUNCTION foo_leakproof(bar int)
RETURNS int AS $$
  SELECT bar+1;
$$
IMMUTABLE LEAKPROOF
LANGUAGE sql;

CREATE FUNCTION foo_leaky(bar int)
RETURNS int AS $$
  SELECT bar+1;
$$
IMMUTABLE
LANGUAGE sql;

Will make all the calls slower, but the calls on v2 (our view with a security_barrier) massively slower (2x slower) on the version with and without LEAKPROOF. This is also in agreement with the docs on security_barrier,

Views created with the security_barrier may perform far worse than views created without this option. In general, there is no way to avoid this: the fastest possible plan must be rejected if it may compromise security. For this reason, this option is not enabled by default.

;tldr LEAKPROOF seems to never make queries slower. However, it's not making them faster either -- in my tests. Perhaps it gives the ability to greater-optimize a query on a view with a security_barrier, but I haven't created that environment. security_barrier massively slows down the view, and should only be used when needed.

Benchmarks done with PostgreSQL 10.4.