Postgresql – Recursive CTE based on function values significantly slower on Postgres 12 than on 11

ctepostgresqlpostgresql-12postgresql-performancerecursive

Following up on my question about some queries in Postgres 12 being slower than in 11 I think, I was able to narrow down the problem. It seems like one recursive CTE based on function values is the problematic spot.

I was able to isolate a rather smallish SQL query which runs significantly longer on Postgres 12.1 than on Postgres 11.6, like ca 150ms in Postgres 12.1 vs ca 4ms in Postgres 11.6. I was able to reproduce the phenomenon on various systems: on multiple VMs in VirtualBox; via Docker on two different physical machines. (See appendix for docker commands). However, strange enough, I cannot reproduce it on https://www.db-fiddle.com/ (no difference to be seen there, both are fast).

Now for the query.
First, we create this simple function

CREATE OR REPLACE FUNCTION public.my_test_function()
 RETURNS SETOF record
 LANGUAGE sql
 IMMUTABLE SECURITY DEFINER
AS $function$ 

SELECT 
        1::integer AS id,
        '2019-11-20'::date AS "startDate",
        '2020-01-01'::date AS "endDate"

$function$;

Then for the actual query

WITH  "somePeriods" AS  (
      SELECT * FROM my_test_function() AS 
      f(id integer, "startDate" date, "endDate" date)
),

"maxRecursiveEndDate" AS (

SELECT "startDate", "endDate", id, 
( 
  WITH RECURSIVE prep("startDateParam", "endDateParam") AS (

  SELECT "startDate","endDate" FROM "somePeriods" WHERE id = od.id
  UNION
  SELECT "startDate","endDate" FROM "somePeriods", prep
  WHERE
    "startDate" <= ("endDateParam" + '1 day'::interval ) AND ("endDateParam" + '1 day'::interval ) <= "endDate"
  )
  SELECT max("endDateParam") FROM prep
) AS "endDateNew"

FROM "somePeriods" AS od

)

SELECT * FROM "maxRecursiveEndDate";

What this actually does it not so important here, I guess. The important point is probably that there are multiple CTEs involved, including a RECURSIVE one.

What I tried:

  • I did try without my_test_function, i.e. putting the values directly into the first CTE. This way, there was no problem at all. Runs equally fast on 12 and on 11.
  • On Postgres 12, I played around with MATERIALIZED, but could see no effect. The query still runs as slow as before.

I don't know whether this could actually be a Postgres 12 bug (or performance regression) or whether I'm missing something here.

Appendix: Docker commands I used for reproducing

First, pull images of both versions

docker pull postgres:12.1
docker pull postgres:11.6

Now, run Postgres 12

docker run -d --name my_postgres_12_container postgres:12.1

Now, execute the query

docker exec my_postgres_12_container psql -U postgres -c "

CREATE OR REPLACE FUNCTION public.my_test_function()
 RETURNS SETOF record
 LANGUAGE sql
 IMMUTABLE SECURITY DEFINER
AS \$function\$ 

SELECT 
        1::integer AS id,
        '2019-11-20'::date AS \"startDate\",
        '2020-01-01'::date AS \"endDate\"

\$function\$;

EXPLAIN ANALYZE WITH  \"somePeriods\" AS  (
      SELECT * FROM my_test_function() AS 
      f(id integer, \"startDate\" date, \"endDate\" date)
),

\"maxRecursiveEndDate\" AS (

SELECT \"startDate\", \"endDate\", id, 
( 
  WITH RECURSIVE prep(\"startDateParam\", \"endDateParam\") AS (

  SELECT \"startDate\",\"endDate\" FROM \"somePeriods\" WHERE id = od.id
  UNION
  SELECT \"startDate\",\"endDate\" FROM \"somePeriods\", prep
  WHERE
    \"startDate\" <= (\"endDateParam\" + '1 day'::interval ) AND (\"endDateParam\" + '1 day'::interval ) <= \"endDate\"
  )
  SELECT max(\"endDateParam\") FROM prep
) AS \"endDateNew\"

FROM \"somePeriods\" AS od

)

SELECT * FROM \"maxRecursiveEndDate\";
"

Stop the Postgres 12 container

docker stop my_postgres_12_container

Start Postgres 11 for comparison

docker run -d --name my_postgres_11_container postgres:11.6

Execute the query in Postgres 11

docker exec my_postgres_11_container psql -U postgres -c "

CREATE OR REPLACE FUNCTION public.my_test_function()
 RETURNS SETOF record
 LANGUAGE sql
 IMMUTABLE SECURITY DEFINER
AS \$function\$ 

SELECT 
        1::integer AS id,
        '2019-11-20'::date AS \"startDate\",
        '2020-01-01'::date AS \"endDate\"

\$function\$;

EXPLAIN ANALYZE WITH  \"somePeriods\" AS  (
      SELECT * FROM my_test_function() AS 
      f(id integer, \"startDate\" date, \"endDate\" date)
),

\"maxRecursiveEndDate\" AS (

SELECT \"startDate\", \"endDate\", id, 
( 
  WITH RECURSIVE prep(\"startDateParam\", \"endDateParam\") AS (

  SELECT \"startDate\",\"endDate\" FROM \"somePeriods\" WHERE id = od.id
  UNION
  SELECT \"startDate\",\"endDate\" FROM \"somePeriods\", prep
  WHERE
    \"startDate\" <= (\"endDateParam\" + '1 day'::interval ) AND (\"endDateParam\" + '1 day'::interval ) <= \"endDate\"
  )
  SELECT max(\"endDateParam\") FROM prep
) AS \"endDateNew\"

FROM \"somePeriods\" AS od

)

SELECT * FROM \"maxRecursiveEndDate\";
"

Best Answer

Thanks to the helpful guys at the pgbugs mailing list, it turned out that Just-in-time-compilation (some helpful background information to be found here) being turned on by default in PostgreSQL 12 had been my problem.

Running my query with SET jit = off; solved the issue: Without it, my query runs fast as it should.