Oracle hierarchical update (reparent)

oracle

I have been reading on Oracle hierarchical queries and all of the documentation/examples seem to be focused on SELECTing.

I have a table with parent-child relations and would like to change parents of children. Everything is done on the same table.

  • I can select current parents
  • I can select new parents
  • Old/new parent pairs have a common (otherwise unique) identifier

The data looks something like this:

ID | DISABLED | NAME | PARAM | PARENT
1  | 1        | qwer | 11    | NULL
2  | NULL     | qwer | 12    | NULL
3  | NULL     | tyui | 3     | 1
4  | NULL     | wert | 4     | 1
5  | NULL     | erty | 5     | 1
6  | 1        | asdf | 21    | NULL
7  | NULL     | asdf | 22    | NULL
8  | NULL     | ghjk | 6     | 6
9  | NULL     | hjkl | 7     | 6
10 | NULL     | jklp | 8     | 6

And expected output (changed IDs: 3,4,5 and 8,9,10):

ID | DISABLED | NAME | PARAM | PARENT
1  | 1        | qwer | 11    | NULL
2  | NULL     | qwer | 12    | NULL
3  | NULL     | tyui | 3     | 2
4  | NULL     | wert | 4     | 2
5  | NULL     | erty | 5     | 2
6  | 1        | asdf | 21    | NULL
7  | NULL     | asdf | 22    | NULL
8  | NULL     | ghjk | 6     | 7
9  | NULL     | hjkl | 7     | 7
10 | NULL     | jklp | 8     | 7

EXTRA INFO:
The data consists of multiple roots, each having multiple children (I'm testing on a table with ~20k records). The data has only 2 levels. Although I am interested in the PARAM value, I cannot simply update that, because part of application references "correct" roots and another part references children pointing to broken roots, so reparenting is the only option.

  • I can distinguish roots from children (PARENT IS NULL)
  • I can find roots to be changed (DISABLED IS NOT NULL)
  • I can find replacement root IDs (NAME columns are identical)

Parents to be changed are not static (have to be found in database).

To better illustrate this is how replacement pairs can be constructed (not good at this):

SELECT
  NEW.ID REPLACEMENT,
  OLD.ID TO_REPLACE
FROM
  HIERARCHICAL OLD
LEFT JOIN
  HIERARCHICAL NEW
  ON
    OLD.NAME = NEW.NAME
    AND NEW.DISABLED IS NULL
WHERE
  OLD.PARENT IS NULL
  AND OLD.DISABLED IS NOT NULL
  AND NEW.ID is NOT NULL

I would like to update the table so that entry 3 is parented to 2 instead of 1. Of course there are multiple disabled roots so handwriting updates would be a pain.

I cannot find examples to at least get on track with this, so anything would be appreciated.


I can do this with LOOPs, something like:

DECLARE CURSOR c1 IS 
  SELECT OLDPARENT, NEWPARENT FROM
  <...>;
BEGIN 
  FOR item IN c1 LOOP
    EXECUTE IMMEDIATE 'UPDATE <...>'
  END LOOP; 
END;

But performance of such query is dreadful.

Best Answer

To be honest, with only 2 levels of data, it could be simply achieved with something like this:

drop table t1 purge;

create table t1
(
  id number,
  disable number,
  name varchar2(10),
  param number,
  parent number
);

insert into t1 values (1,  1,    'qwer', 11, null);
insert into t1 values (2,  null, 'qwer', 12, null);
insert into t1 values (3,  null, 'tyui', 3,  1   );
insert into t1 values (4,  null, 'wert', 4,  1   );
insert into t1 values (5,  null, 'erty', 5,  1   );
insert into t1 values (6,  1,    'asdf', 21, null);
insert into t1 values (7,  1,    'asdf', 22, null);
insert into t1 values (8,  null, 'ghjk', 6,  7   );
insert into t1 values (9,  null, 'hjkl', 7,  7   );
insert into t1 values (10, null, 'jklp', 8,  7  );
insert into t1 values (11, null, 'asdf', 23, null);
commit;

I added another row, so 'asdf' has a third entry and we expect parent 7 to turn in to parent 11.

SQL> select * from t1;

        ID    DISABLE NAME            PARAM     PARENT
---------- ---------- ---------- ---------- ----------
         1          1 qwer               11
         2            qwer               12
         3            tyui                3          1
         4            wert                4          1
         5            erty                5          1
         6          1 asdf               21
         7          1 asdf               22
         8            ghjk                6          7
         9            hjkl                7          7
        10            jklp                8          7
        11            asdf               23

And the update:

update t1 set parent = 
(
  select new_parent_id from
  (
    select id, name, disable, parent, lead(id) over (partition by name order by id) as new_parent_id , row_number() over (partition by name order by id desc ) as r from t1
  ) new_parent
  where disable = 1 and parent is null and r = 2 and new_parent.id = t1.parent
)
where t1.parent in (select id from t1 where disable = 1);

6 rows updated.

The result:

SQL> select * from t1;

        ID    DISABLE NAME            PARAM     PARENT
---------- ---------- ---------- ---------- ----------
         1          1 qwer               11
         2            qwer               12
         3            tyui                3          2
         4            wert                4          2
         5            erty                5          2
         6          1 asdf               21
         7          1 asdf               22
         8            ghjk                6         11
         9            hjkl                7         11
        10            jklp                8         11
        11            asdf               23

The above statement updates rows with disabled parents, and finds the new parent value within the subquery. Notice how 6 rows were updated, but the subquery was executed only twice, once per parent value, not for every row.

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT          |      |      1 |        |      0 |00:00:00.01 |      38 |       |       |          |
|   1 |  UPDATE                   | T1   |      1 |        |      0 |00:00:00.01 |      38 |       |       |          |
|*  2 |   HASH JOIN SEMI          |      |      1 |      6 |      6 |00:00:00.01 |      10 |  1344K|  1344K|  788K (0)|
|   3 |    TABLE ACCESS FULL      | T1   |      1 |     11 |     11 |00:00:00.01 |       7 |       |       |          |
|*  4 |    TABLE ACCESS FULL      | T1   |      1 |      3 |      3 |00:00:00.01 |       3 |       |       |          |
|*  5 |   VIEW                    |      |      2 |     11 |      2 |00:00:00.01 |      14 |       |       |          |
|*  6 |    WINDOW SORT PUSHED RANK|      |      2 |     11 |     20 |00:00:00.01 |      14 |  2048 |  2048 | 2048  (0)|
|   7 |     WINDOW SORT           |      |      2 |     11 |     22 |00:00:00.01 |      14 |  2048 |  2048 | 2048  (0)|
|   8 |      TABLE ACCESS FULL    | T1   |      2 |     11 |     22 |00:00:00.01 |      14 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."PARENT"="ID")
   4 - filter("DISABLE"=1)
   5 - filter(("DISABLE"=1 AND "PARENT" IS NULL AND "R"=2 AND "NEW_PARENT"."ID"=:B1))
   6 - filter(ROW_NUMBER() OVER ( PARTITION BY "NAME" ORDER BY INTERNAL_FUNCTION("ID") DESC )<=2)