PostgreSQL CTE – Are CTEs Multithreaded?

cteparallelismpostgresql

Here and here seem to suggest that they are, but I've also been told (source unlocatable) that a multi-row single-statement insert will only use one thread for all writing, that to multithread it, the query should be broken up and sent to as many connections as there are cores.

The documentation states that CTEs are executed "concurrently", but none linked say explicitly in effect that "CTEs will use all available cores to process any parallel sub-statements and/or the main query".

Do CTEs always use all available cores for any parallel tasks? If so, are there any caveats aside from what could be reasonably assumed from chaining with RETURNING?

Best Answer

Postgres currently does not use parallelism when executing a single query. The entire execution engine is single threaded for each query (you can of course still execute multiple queries in parallel). This lack of "intra query parallelism" is one of the great drawbacks of running Postgres or MySQL. Basically, it makes those engines unfit for purpose on large datasets.

To get multi-threaded Postgres execution, you either need to split up the query into multiple queries yourself or run a Postgres compatible variant like Amazon Redshift or Greenplum (the former has added intra query parallelism to the code base, the latter does the splitting for you).

Whenever you feel the need to run a parallel query, first ask yourself: Could I add indexes that would make this fast enough without parallelism? If that answer is "Yes", pursue that option first. Very often though, you need brute force. And when it comes to brute force, vanilla Postgres is unfortunately not a particularly good database engine.