I have an hierarchical query that uses CONNECT BY:
select
wonum,
parent,
classification,
classstructureid,
division,
worktype,
status,
glaccount,
fircode,
actstart,
actfinish,
siteid,
connect_by_root wonum as topworkpackage,
level,
ltrim(sys_connect_by_path(wonum,' \ '),' \ ') wohierarchy
from
workorder
where
istask = 0
and woclass in ('WORKORDER', 'ACTIVITY')
and siteid = 'SERVICES'
connect by prior wonum = parent
start with parent is null
order siblings by classification
Here's a sample of some of the relevant columns from the query:
WONUM | PARENT | CLASSIFICATION | TOPWORKPACKAGE | LEVEL | WOHIERARCHY |
---|---|---|---|---|---|
WO55016 | ROADS | WO55016 | 1 | WO55016 | |
WO55015 | WO55016 | ROADS \ WINTER | WO55016 | 2 | WO55016 \ WO55015 |
WO43181 | WO55015 | ROADS \ WINTER | WO55016 | 3 | WO55016 \ WO55015 \ WO43181 |
WO43183 | WO43181 | ROADS \ WINTER \ A | WO55016 | 4 | WO55016 \ WO55015 \ WO43181 \ WO43183 |
WO37342 | WO43181 | ROADS \ WINTER \ B | WO55016 | 4 | WO55016 \ WO55015 \ WO43181 \ WO37342 |
WO43182 | WO43181 | ROADS \ WINTER \ C | WO55016 | 4 | WO55016 \ WO55015 \ WO43181 \ WO43182 |
Performance:
If I run the query without ordering the CONNECT BY siblings, then the cost is low (1027):
(Note: I *excluded* this line when I ran query: ORDER SIBLINGS BY CLASSIFICATION)
Plan hash value: 1017899897
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 82500 | 74M| 1027 (46)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | CONNECT BY NO FILTERING WITH START-WITH| | | | | |
| 3 | TABLE ACCESS FULL | WORKORDER | 22158 | 1947K| 566 (2)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ISTASK"=0 AND ("WOCLASS"='WORKORDER' OR "WOCLASS"='ACTIVITY') AND
"SITEID"='SERVICES')
2 - access("PARENT"=PRIOR "WONUM")
filter("PARENT" IS NULL)
However, if I order the CONNECT BY siblings, then the cost is high (16975):
(Note: I *included* this line when I ran query: ORDER SIBLINGS BY CLASSIFICATION)
Plan hash value: 1017899897
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 82500 | 74M| 16975 (1)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | CONNECT BY NO FILTERING WITH START-WITH| | | | | |
| 3 | TABLE ACCESS FULL | WORKORDER | 22158 | 1947K| 566 (2)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ISTASK"=0 AND ("WOCLASS"='WORKORDER' OR "WOCLASS"='ACTIVITY') AND
"SITEID"='SERVICES')
2 - access("PARENT"=PRIOR "WONUM")
filter("PARENT" IS NULL)
Question:
As a novice, I'm wondering:
Is there is a way to optimize ordering the CONNECT BY siblings for better performance?
- I've tried adding indexes to the columns that are involved in the ordering/CONNECT BY. But that doesn't seem to help; the indexes don't get used.
Best Answer
Usually when we add "ORDER BY" to a SELECT, the execution cost increases. We get around this by creating/changing the indexes or the "ORDER BY" clause so that they coincide with each other. But even so, it may not be possible or the execution plan may change with time.
There is another solution, sometimes the application wants to access only a small amount of lines at once, in this case you could do a function that will sort only the part that the application will actually use.