Postgresql – How to aggregate over a “multi source” with recursive, in postgres

Imagine I have a table "departments" like this:

name TEXT
parent_department TEXT (nullable)

I also have a "budgets" table like this:

department_name TEXT
budget INTEGER

I need to "traverse" this departments table from "leaf to root", following the parent department column. The end result is that I need to sum up the budgets of each department, but the twist is that the "budgets" table that I'll be using at any given time is very sparse (10s to 100s of rows), while departments is very dense (100m rows).

Normally, I would approach this by creating a "with recursive" CTE that "flattens" the department relationship into a temporary table of just "root department", "leaf department". Then I would join that table on leaf department to budgets, and group by root. The problem is that the "department" table is FAR too big now (the analogy to departments breaks down here a little bit I admit, no company has >100m departments).

Concretely, if I had departments like this (child -> parent):

C -> B
B -> A
A -> NULL
E -> D
D -> NULL
F -> G
G -> NULL

And budgets like this:

A 1
C 3
E 5

I would want to get as output:

A 4
D 5

And ideally, F and G, for example, would never be touched, because there are hundreds of millions of rows like F and G that shouldn't come up. But in my current query, they do. How can I fix this so that I only "traverse" the departments that I actually need to?

WITH RECURSIVE initial AS ( SELECT department_name, department_name, FALSE FROM budgets GROUP BY department_name ), tmp (child, parent, root) AS ( SELECT * FROM initial UNION ALL SELECT tmp.child, CASE WHEN departments.parent_department IS NULL THEN tmp.parent ELSE departments.parent_department END, departments.parent_department IS NULL FROM tmp INNER JOIN departments ON tmp.parent = departments.name WHERE NOT tmp.root ) SELECT tmp.parent, SUM(budget) FROM tmp INNER JOIN budgets ON tmp.child = budgets.department_name WHERE tmp.root GROUP BY tmp.parent

Postgresql – How to aggregate over a “multi source” with recursive, in postgres

Best Answer

Related Question

Best Answer

Related Solutions

Postgresql – How to speed up a Postgres query containing lots of Joins with an ILIKE condition

Postgresql – How to sort the results of a recursive query in an expanded tree-like fashion

Related Question