Updating duplicate records with different data

duplicationoracleoracle-11gupdate

I have a migration specific issue. Data from a large table has been migrated into another table, but data for one destination column is missing due to certain issues (that are not specific to this thread). Currently I am trying to update the data for missing column in the new table.

The problem is because of the current data in the newtable, it has duplicate rows and I haven't found a way to do one to one mapping of rows so that data for the missing column can be updated.

create table oldtable
(
fileNo varchar(10),
folder varchar(10),
fileType varchar(10),
col_1 varchar(10),
col_2 varchar(10),
col_3 varchar(10),
col_4 varchar(10)
);

INSERT INTO oldtable (fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('ABC' , 'Folder1','Type1' ,'111' ,'111','111' ,null);
INSERT INTO oldtable (fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('ABC' , 'Folder1','Type2' ,'111' ,'111','111',null);
INSERT INTO oldtable (fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('GHI' , 'Folder3','Type3' ,'333' ,'333','333','333');
INSERT INTO oldtable (fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('JKL' , 'Folder4','Type3' ,'444' ,'444','444','444');
INSERT INTO oldtable (fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('MNO' , 'Folder5','Type4' ,'555' ,'555' ,'555' ,'555');
INSERT INTO oldtable (fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('PQR' , 'Folder6','Type4' ,'666' ,'666' ,'666' ,'666');
INSERT INTO oldtable (fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('STU' , 'Folder7','Type5' ,'777' ,'777','777','777')

create table newtable
(
fileNo varchar(10),
folder varchar(10),
fileType varchar(10),
col_1 varchar(10),
col_2 varchar(10),
col_3 varchar(10),
col_4 varchar(10)
);

INSERT INTO newtable(fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('ABC' , 'Folder1', null ,'111' ,'111','111' ,null);
INSERT INTO newtable(fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('ABC' , 'Folder1', null,'111' ,'111','111',null);
INSERT INTO newtable(fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('GHI' , 'Folder3', null ,'333' ,'333','333','333');
INSERT INTO newtable(fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('JKL' , 'Folder4',null ,'444' ,'444','444','444');
INSERT INTO newtable(fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('MNO' , 'Folder5',null ,'555' ,'555' ,'555' ,'555');
INSERT INTO newtable(fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('PQR' , 'Folder6',null ,'666' ,'666' ,'666' ,'666');
INSERT INTO newtable(fileNo , folder, fileType, COL_1 ,COL_2 ,COL_3 ,COL_4 ) VALUES ('STU' , 'Folder7',null ,'777' ,'777','777','777')

select * from oldtable;

+--------+---------+----------+-------+-------+-------+-------+
| FILENO | FOLDER  | FILETYPE | COL_1 | COL_2 | COL_3 | COL_4 |
+--------+---------+----------+-------+-------+-------+-------+
| ABC    | Folder1 | Type1    |   111 |   111 |   111 |       |
| ABC    | Folder1 | Type2    |   111 |   111 |   111 |       |
| GHI    | Folder3 | Type3    |   333 |   333 |   333 |   333 |
| JKL    | Folder4 | Type3    |   444 |   444 |   444 |   444 |
| MNO    | Folder5 | Type4    |   555 |   555 |   555 |   555 |
| PQR    | Folder6 | Type4    |   666 |   666 |   666 |   666 |
| STU    | Folder7 | Type5    |   777 |   777 |   777 |   777 |
+--------+---------+----------+-------+-------+-------+-------+

select * from newtable;

+--------+---------+----------+-------+-------+-------+-------+
| FILENO | FOLDER  | FILETYPE | COL_1 | COL_2 | COL_3 | COL_4 |
+--------+---------+----------+-------+-------+-------+-------+
| ABC    | Folder1 |          |   111 |   111 |   111 |       |
| ABC    | Folder1 |          |   111 |   111 |   111 |       |
| GHI    | Folder3 |          |   333 |   333 |   333 |   333 |
| JKL    | Folder4 |          |   444 |   444 |   444 |   444 |
| MNO    | Folder5 |          |   555 |   555 |   555 |   555 |
| PQR    | Folder6 |          |   666 |   666 |   666 |   666 |
| STU    | Folder7 |          |   777 |   777 |   777 |   777 |
+--------+---------+----------+-------+-------+-------+-------+ 

When trying to update FileType column for all the records –

update newtable 
set FileType = 
(
    select FileType 
    from oldtable 
    where Fileno = newtable.fileNo 
    and folder = newtable.Folder
);

It would work fine for all other records except first two which are currently duplicate. It won't obviously work because it would give – 'single row subquery returns multiple rows' error because of those duplicate records.
I am trying to update FileType column for first two rows with values "Type1" and "Type2".
Out of those two records, it won't matter which record from newtable is updated with "Type1" and which one gets "Type2".

Please help with any ideas around updating such duplicate records with the specific data from the oldtable into newtable.

I tried using Merge but it gives similar error:

ORA-30926: unable to get a stable set of rows in the source tables.

Not sure how I can use rowid, rownum?

Best Answer

Using the ranking ROW_NUMBER() function will work. First give row numbers to all rows in both tables, then join using these row numbers, then update:

with 
  oldt as
  ( select fileNo , folder, fileType,
           row_number() over (partition by fileNo, folder
                              order by fileType)
             as rn
    from oldtable
  ),

  newt as
  ( select fileNo , folder, fileType,
           row_number() over (partition by fileNo, folder
                              order by fileType)
             as rn
    from newtable
  ),

  upd as
  ( select 
        n.fileType,
        o.fileType as old_fileType
    from newt n
      join oldt o
      on  n.fileNo = o.fileNo
      and n.folder = o.folder
      and n.rn     = o.rn 
  ) 

update
    upd
set 
    fileType = old_fileType ;

SQLfiddle seems to be giving error for Oracle, so it has been tested in SQL Server only: SQLfiddle-test (but this syntax should be valid for Oracle, too).


Tested in Oracle, the above doesn't work, sadly. I think because statements that start with WITH can only be SELECT statements. Even if the query is rearranged (I tried several rewrites), Oracle throws various errors. The only way I managed to have it working is after adding another column in newtable and a unique constraint on it. Then the following works (nid is the added primary key column).

Tested in Oracle's Live SQL site:

update 
  ( with 
      oldt as
      ( select fileNo , folder, fileType,
               row_number() over (partition by fileNo, folder
                                  order by fileType)
                 as rn
        from oldtable
      ),
    newt as
      ( select fileNo , folder, nid,
               row_number() over (partition by fileNo, folder  
                                  order by fileType)
                 as rn
        from newtable
      ),
    upd as
      ( select 
            n.nid,
            o.fileType as old_fileType
        from newt n
          join oldt o
          on  n.fileNo = o.fileNo
          and n.folder = o.folder
          and n.rn     = o.rn 
      ) 
    select 
        up.fileType,
        ( select upd.old_fileType
          from upd 
          where upd.nid = up.nid 
        ) as old_fileType
    from newtable up
  ) x
set fileType = old_fileType ;