Ordering CONNECT BY siblings

hierarchyoptimizationoracleoracle-19cquery-performance

I have an hierarchical query that uses CONNECT BY:

select 
    wonum,
    parent,
    classification,
    classstructureid,
    division,
    worktype,
    status, 
    glaccount, 
    fircode, 
    actstart,
    actfinish,
    siteid,

    connect_by_root wonum as topworkpackage,
    level,
    ltrim(sys_connect_by_path(wonum,' \ '),' \ ') wohierarchy
from 
    workorder
where
    istask = 0
    and woclass in ('WORKORDER', 'ACTIVITY')
    and siteid = 'SERVICES'

connect by prior wonum = parent
start with parent is null

order siblings by classification

Here's a sample of some of the relevant columns from the query:

WONUM PARENT CLASSIFICATION TOPWORKPACKAGE LEVEL WOHIERARCHY
WO55016 ROADS WO55016 1 WO55016
WO55015 WO55016 ROADS \ WINTER WO55016 2 WO55016 \ WO55015
WO43181 WO55015 ROADS \ WINTER WO55016 3 WO55016 \ WO55015 \ WO43181
WO43183 WO43181 ROADS \ WINTER \ A WO55016 4 WO55016 \ WO55015 \ WO43181 \ WO43183
WO37342 WO43181 ROADS \ WINTER \ B WO55016 4 WO55016 \ WO55015 \ WO43181 \ WO37342
WO43182 WO43181 ROADS \ WINTER \ C WO55016 4 WO55016 \ WO55015 \ WO43181 \ WO43182

Performance:

If I run the query without ordering the CONNECT BY siblings, then the cost is low (1027):

(Note: I *excluded* this line when I ran query: ORDER SIBLINGS BY CLASSIFICATION)

Plan hash value: 1017899897
 
------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |           | 82500 |    74M|  1027  (46)| 00:00:01 |
|*  1 |  FILTER                                  |           |       |       |            |          |
|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|           |       |       |            |          |
|   3 |    TABLE ACCESS FULL                     | WORKORDER | 22158 |  1947K|   566   (2)| 00:00:01 |
------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("ISTASK"=0 AND ("WOCLASS"='WORKORDER' OR "WOCLASS"='ACTIVITY') AND 
              "SITEID"='SERVICES')
   2 - access("PARENT"=PRIOR "WONUM")
       filter("PARENT" IS NULL)

However, if I order the CONNECT BY siblings, then the cost is high (16975):

(Note: I *included* this line when I ran query: ORDER SIBLINGS BY CLASSIFICATION)

Plan hash value: 1017899897
 
------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |           | 82500 |    74M| 16975   (1)| 00:00:01 |
|*  1 |  FILTER                                  |           |       |       |            |          |
|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|           |       |       |            |          |
|   3 |    TABLE ACCESS FULL                     | WORKORDER | 22158 |  1947K|   566   (2)| 00:00:01 |
------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("ISTASK"=0 AND ("WOCLASS"='WORKORDER' OR "WOCLASS"='ACTIVITY') AND 
              "SITEID"='SERVICES')
   2 - access("PARENT"=PRIOR "WONUM")
       filter("PARENT" IS NULL)

Question:

As a novice, I'm wondering:

Is there is a way to optimize ordering the CONNECT BY siblings for better performance?

  • I've tried adding indexes to the columns that are involved in the ordering/CONNECT BY. But that doesn't seem to help; the indexes don't get used.

Best Answer

Usually when we add "ORDER BY" to a SELECT, the execution cost increases. We get around this by creating/changing the indexes or the "ORDER BY" clause so that they coincide with each other. But even so, it may not be possible or the execution plan may change with time.

There is another solution, sometimes the application wants to access only a small amount of lines at once, in this case you could do a function that will sort only the part that the application will actually use.