Postgresql – Does bool_and short-circuit function calls with arguments

aggregatefunctionsoptimizationpostgresql

Suppose I have the following query:

SELECT a.*
FROM a
CROSS JOIN b
GROUP BY a.id -- Primary key
HAVING bool_and(expensive_comparison(a, b))

What will PostgreSQL do with this query? Will it process expensive_comparison on all pairs, or will it stop and return immediately for a given for a given a.id?

Is there a more efficient, equivalent way to write this query without resorting to procedural code?

I'm using PostgreSQL 9.3, but if it's changed in any newer versions, that would be relevant info.

Best Answer

It depends on whether the funciton is IMMUTABLE or not. All tests done on PostgreSQL 9.5.

CREATE FUNCTION f_expensive_volatile ()
RETURNS bool AS $$
  SELECT true
  FROM pg_sleep(1)
$$ LANGUAGE SQL
VOLATILE;

CREATE FUNCTION f_expensive_stable ()
RETURNS bool AS $$
  SELECT true
  FROM pg_sleep(1)
$$ LANGUAGE SQL
STABLE;

CREATE FUNCTION f_expensive_immutable ()
RETURNS bool AS $$
  SELECT true
  FROM pg_sleep(1)
$$ LANGUAGE SQL
IMMUTABLE;

Now we create sample data.

CREATE TABLE f AS
SELECT 1 AS id, *
FROM generate_series(1,100) AS gs(x); 

Now on the IMMUTABLE one, note Execution time: 0.138 ms,

EXPLAIN ANALYZE
  SELECT id
  FROM f
  GROUP BY id
  HAVING bool_and( f_expensive_immutable() );

                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 HashAggregate  (cost=2.50..2.51 rows=1 width=4) (actual time=0.083..0.083 rows=1 loops=1)
   Group Key: id
   Filter: bool_and(true)
   ->  Seq Scan on f  (cost=0.00..2.00 rows=100 width=4) (actual time=0.015..0.030 rows=100 loops=1)
 Planning time: 1001.328 ms
 Execution time: 0.138 ms
(6 rows)

And on the STABLE one... note me not waiting around

EXPLAIN ANALYZE
  SELECT id
  FROM f
  GROUP BY id
  HAVING bool_and( f_expensive_stable() );

And also this exemplifies me not waiting around...

EXPLAIN ANALYZE
  SELECT id
  FROM f
  GROUP BY id
  HAVING bool_and( f_expensive_volatile() );

With arguments

CREATE FUNCTION f_expensive_immutable (int)
RETURNS bool AS $$
  SELECT true
  FROM pg_sleep(1)
$$ LANGUAGE SQL
IMMUTABLE;

EXPLAIN ANALYZE
  SELECT id
  FROM f
  GROUP BY id
  HAVING bool_and( f_expensive_immutable(id) );

Yes, it seems that optimization is lost if you provide arguments