Oracle Hierarchical query: with two node attributes NodeId and NodeType

hierarchyoracle-11g-r2recursive

I have the following use case where I want to make use of hierarchical queries to get the desired result.

In my use case I have two types of node say 'A' and 'B'. So the unique identifier of node is its ID and Type.

Now the problem is when two nodes with same Id and different type, when I call connect by clause only on id I get child for other types of nodes also (if id of that node is same).

create table TreeTest (
nodeid integer,
nodetype char(1),
parentid integer,
parenttype char(1)
);

data in table:

   1, 'A', NULL, null
   2, 'A', 1, 'A'
   3, 'A', 1, 'A'   
   2, 'B', NULL, null
   3, 'B', 2, 'B'

Now I am firing the following query (which is not giving the correct result)

SELECT * FROM TREETEST
START WITH PARENTID = 1
CONNECT BY PRIOR nodeid = PARENTID;

I tried following query also, but again wrong result

SELECT * FROM TREETEST
START WITH PARENTID = 1 AND PARENTTYPE = 'A'
CONNECT BY PRIOR nodeid = PARENTID AND NODETYPE = PARENTTYPE;

Kindly provide the correct query and kindly also explain why second query is not working.

Expected output:

2   A   1   A
3   A   1   A

Best Answer

You're missing a PRIOR for the second condition in the connect by.

SQL> select * from TREETEST
     start with PARENTID = 1 and PARENTTYPE = 'A'
     connect by
          prior NODEID = PARENTID
      and prior NODETYPE = PARENTTYPE; -- note the PRIOR here too

    NODEID N   PARENTID P
---------- - ---------- -
         2 A          1 A
         3 A          1 A

Your query looks a bit odd to me, but that might very well be legit. I would have expected the first line in your sample data to be your root node, and thus a query like this:

SQL> select * from TREETEST
     start with NODEID = 1 and NODETYPE = 'A'
     connect by
          prior NODEID = PARENTID
      and prior NODETYPE = PARENTTYPE;

    NODEID N   PARENTID P
---------- - ---------- -
         1 A
         2 A          1 A
         3 A          1 A

And if you didn't want the parent to show up in the result set:

SQL> select * from TREETEST
      where level > 1       -- don't return the root node
      start with NODEID = 1 and NODETYPE = 'A'
      connect by
           prior NODEID = PARENTID
       and prior NODETYPE = PARENTTYPE;

    NODEID N   PARENTID P
---------- - ---------- -
         2 A          1 A
         3 A          1 A

Some references from the Hierarchical Queries documentation:

PRIOR is a unary operator and has the same precedence as the unary + and - arithmetic operators. It evaluates the immediately following expression for the parent row of the current row in a hierarchical query.

If you look at the railroad diagram there, you'll see the syntax is:

starts with condition connect by [nocycle] condition [AND condition]*

And in this case, for the connect by part, each condition is a simple comparison condition with the syntax:

expression operator expression

The PRIOR operator doesn't apply to the whole condition, it is part of the expression(s) in the conditions. I.e. it's as if it was parsed like this:

connect by ( PRIOR(child_column) = parent_column )

(PRIOR can also appear on the right-hand side.)

Your example query:

CONNECT BY PRIOR nodeid = PARENTID AND NODETYPE = PARENTTYPE;

is interpreted as:

CONNECT BY ( PRIOR(nodeid) = PARENTID  ) AND ( NODETYPE = PARENTTYPE );

The second condition (after the AND) applies to the type columns of the current row only, it doesn't reference the parent row at all. The PRIOR is part of the first condition only, and doesn't somehow influence to the second.