Postgresql – Is PostgreSQL function parallel safety intended for functions which update data

cteparallelismpostgresqlpostgresql-12postgresql-performance

I am having some trouble understanding function parallel safety. I can understand what it does – which is to specify whether the function will be executed by the parallel master node or whether the query will totally disable parallelism.

But I am having a hard time understand WHEN to use it. Let me give an example. Say I have a function which only creates tables and does not update any data or insert any rows. What should the parallel safety of this be marked as? I would think to mark it as parallel safe, but then I cannot understand why the manual would list the following as parallel restricted:

  • Scans of common table expressions (CTEs).
  • Scans of temporary tables.

Since both points do not necessarily involve any data update/insert (race condition)?

The case for CTE is particularly strange. I am understanding that if I add a parallel safe function to a CTE call, it will not be executed as parallel safe any longer? And, outside of function parallel safety, I am understanding that all CTEs are executed as parallel restricted? Is there a difference between materialized and not materialized CTEs here?

Best Answer

The reason is that both CTEs and temporary tables are private to the backend process that created them, and parallel worker processes are different processes that do not have access to the private resources of the leader process.

To make CTE scans parallel safe, CTEs would have to be materialized in shared memory.

I would be careful marking a function that creates a table as parallel safe, since that creates locks in the database. Can you guarantee that there can never be any conflicts so that your worker processes lock each other?