Postgresql – What’s the (Big O) computational complexity of a PostgreSQL recursive common table expression

cteperformancepostgresqlpostgresql-performance

For example, taking this StackOverflow #44620695 question, recursive path aggregation and CTE query for top-down tree postgres as an example, which uses a recursive CTE to traverse a tree structure to determine the paths from a starting node.

enter image description here

The screenshot above shows the the input data, the recursive CTE result, and a visualization of the source data.

Recursive CTE are iterative over the preceding result — right? (as suggested in the accepted answer here) — so would the time complexity be something like O(log n)?

Best Answer

OK, I'm going to propose a solution. Using the excellent article "A Gentle Introduction to Algorithm Complexity Analysis" as a guide, I believe the worst-case complexity of the example in my question (above) is as follows.

Given this recursive CTE:

WITH RECURSIVE nodes(id, src, path, tgt) AS (
    -- Anchor member:
    SELECT id, s, concat(s, '->', t), t,
    array[id] as all_parent_ids
    FROM tree WHERE s = 'a'
UNION ALL
    -- Recursive member:
    SELECT t.id, t.s, concat(path, '->', t), t.t, all_parent_ids||t.id FROM tree t
    JOIN nodes n ON n.tgt = t.s
    AND t.id <> ALL (all_parent_ids)
)
-- Invocation:
SELECT path FROM nodes;
  1. At the Anchor member (query definition), the algorithm selects each row from the table; therefore, at this step the maximum number of iterations (i) and the maximum size (n) is the number of rows in table; i < n, if a starting point within the table is specified.

  2. The Recursive member selects each row from the table, starting from the position specified in the anchor member, so the maximum number of iterations here once again is: i ≤ n.

So, with the recursive CTE above I believe that the overall complexity is Θ(n).