Postgresql – Performing SELECT on EACH ROW in CTE or Nested QUERY

ctepostgresqltree

This is a problem in PostgreSQL

I have a table which stores the tree of users;

    +------+---------+
    |  id  | parent  |
    |------+---------|
    |  1   |   0     |
    |------|---------|
    |  2   |   1     |
    |------|---------|
    |  3   |   1     |
    |------|---------|
    |  4   |   2     |
    |------|---------|
    |  5   |   2     |
    |------|---------|
    |  6   |   4     |
    |------|---------|
    |  7   |   6     |
    |------|---------|
    |  8   |   6     |
    +------+---------+

I can query a complete tree from any node by using the connectby function from the tablefunc extension, and I can separately query the size of tree in terms of total nodes in it, for example

tree for #1 has size 7
tree for #5 has size 0
tree for #6 has size 2, and so on

Now I want to do something like Selecting all possible trees from this table (which is again carried out by connectby), count the size of it and create another dataset with records of ID and size of underlying tree, like this:

    +------------------+-------------+
    |  tree_root_node  |  tree_size  |
    |------------------+-------------|
    |      1           |     7       |
    |------------------+-------------|
    |      2           |     3       |
    |------------------+-------------|
    |      3           |     0       |
    |------------------+-------------|
    |      4           |     3       |
    |------------------+-------------|
    |      5           |     0       |
    |------------------+-------------|
    |      6           |     2       |
    |------------------+-------------|
    |      7           |     0       |
    |------------------+-------------|
    |      8           |     0       |
    +------------------+-------------+

The problem is, I am unable to perform the same SELECT statement for every available row in original table in order to fetch the tree and calculate the size, and even if I could, I dont know how to create a separate dataset using the fetched and calculated data.

I am not sure if this could be simple use of some functions available in Postgres or I'd have to write a function for it or simply I dont know what exactly is this kind of query is called but googling for hours and searching for another hour over here at dba.stackexchange returned nothing.

Can someone please point to right direction ?

Best Answer

Let me know if this helps. This is using MSSQL (T-SQL) syntax so you might have to adjust it for postgres, but, using a temp table called #tree, containing the two columns id and parent and populated as your example is with

SELECT 1, 0
UNION 
SELECT 2, 1
UNION 
SELECT 3, 1
UNION
SELECT 4, 2
UNION 
SELECT 5, 2
UNION 
SELECT 6, 4
UNION
SELECT 7, 6
UNION 
SELECT 8, 6:

with cte as
(
    select id as currentnode, id as root
    from #tree
    union all
    select t.id as currentnode, cte.root 
    from #tree t join cte on t.parent = cte.currentnode
)
select *
from cte
order by root

The above recursive CTE will yield all nodes reachable starting from some root. You can do:

SELECT root, count(*)
from cte
group by root

The count(*) query should yield,

root   |  count
----------------
  1    |    8
  2    |    6
  3    |    1
  4    |    4
  5    |    1
  6    |    3
  7    |    1
  8    |    1

You could then subtract 1 from each of those to get the result you were hinting at. Is this what you were looking for? Even if not, I feel like a recursive CTE can probably help with what you need.