I did a tree structure using id and parent_id in the same table. For query's I'm using CTE provide by PostgreSQL, but it's spend so much time to do the joins with recursive results. For example, by the time I have 100 records on sadt_lot table, and this query is spend 8 seconds to return the results. Someone have a better idea to do it?
Example: Listing all sadt's grouped by root sadt_lot's
EXPLAIN ANALYZE WITH RECURSIVE downlots as (
SELECT sl1.sadt_lot_id, 0 AS level, sl1.sadt_lot_id as root_id
FROM sadt_lot sl1
WHERE sl1.parent_lot_id IS NULL
UNION
SELECT sl2.sadt_lot_id, d.level + 1, d.sadt_lot_id as root_id
FROM sadt_lot sl2
INNER JOIN downlots d ON d.sadt_lot_id = sl2.parent_lot_id
)
SELECT sl.sadt_lot_id, array_agg(s.sadt_id)
FROM sadt_lot sl
LEFT JOIN sadt s ON s.sadt_lot_id = any(SELECT sadt_lot_id FROM downlots WHERE root_id = sl.sadt_lot_id)
WHERE sl.parent_lot_id IS NULL
group by sl.sadt_lot_id
ORDEr By sl.sadt_lot_id
Query Plan
GroupAggregate (cost=42.53..15077.74 rows=1 width=36) (actual time=104.090..8436.505 rows=90 loops=1)
Group Key: sl.sadt_lot_id
CTE downlots
-> Recursive Union (cost=0.00..42.39 rows=101 width=12) (actual time=0.006..0.104 rows=95 loops=1)
-> Seq Scan on sadt_lot sl1 (cost=0.00..2.94 rows=1 width=12) (actual time=0.005..0.019 rows=90 loops=1)
Filter: (parent_lot_id IS NULL)
Rows Removed by Filter: 5
-> Hash Join (cost=0.33..3.74 rows=10 width=12) (actual time=0.027..0.028 rows=2 loops=2)
Hash Cond: (sl2.parent_lot_id = d.sadt_lot_id)
-> Seq Scan on sadt_lot sl2 (cost=0.00..2.94 rows=94 width=8) (actual time=0.002..0.008 rows=95 loops=2)
-> Hash (cost=0.20..0.20 rows=10 width=8) (actual time=0.010..0.010 rows=48 loops=2)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> WorkTable Scan on downlots d (cost=0.00..0.20 rows=10 width=8) (actual time=0.001..0.004 rows=48 loops=2)
-> Nested Loop Left Join (cost=0.14..15004.14 rows=6242 width=8) (actual time=8.234..8434.229 rows=11345 loops=1)
Join Filter: (SubPlan 2)
Rows Removed by Join Filter: 1112125
-> Index Only Scan using sadt_lot_sadt_lot_id_parent_lot_id_idx on sadt_lot sl (cost=0.14..12.86 rows=1 width=4) (actual time=0.011..0.252 rows=90 loops=1)
Index Cond: (parent_lot_id IS NULL)
Heap Fetches: 90
-> Seq Scan on sadt s (cost=0.00..635.83 rows=12483 width=8) (actual time=0.002..1.785 rows=12483 loops=90)
SubPlan 2
-> CTE Scan on downlots (cost=0.00..2.27 rows=1 width=4) (actual time=0.003..0.007 rows=1 loops=1123470)
Filter: (root_id = sl.sadt_lot_id)
Rows Removed by Filter: 94
Planning time: 0.203 ms
Execution time: 8436.598 ms
Best Answer
I found the solution. I was using the recursive expression how parameter to join and it was do several lops on the table used on join, a better aprouch is before join with this table(sadt), do the join with recursive expression(downlots "table") and after, using result, do join with the sadt, with that, the query jump from 8sec to 8ms. Follow the solution:
Query Plan