Repopulating Oracle tables using data from Excel/CSV file

csvoraclesql-loader

I'm developing a Java EE application that utilizes Hibernate/JPA. Some of the data I want to use comes from another Business unit's spreadsheet that I need to periodically (re)load into 5 Oracle tables, overwriting any existing data. I'm using a shell script that kicks off 5 sql*loader control files, where each control file specifies filler fields to skip because the excel file is not normalized.

OPTIONS(skip=2)
load data
REPLACE
--#########################################
into table TABLE_NAME
 fields terminated by "," optionally enclosed by '"'       
TRAILING NULLCOLS
  (
ACCOUNT_ID ,
REST_STOP_ID ,
DESCRIPTION FILLER ,
GROUP FILLER ,
FUNCTION  ,
JOB_ID "substr(:JOB_ID, 0, INSTR(:JOB_ID,' - ',1))", 
<35 other fields>
etc...

First time import of this data into empty tables using SQL*Loader and a csv file from this spreadsheet works fine. But then any subsequent run sql*loader fails because either control file option:

when using REPLACE:

SQL*Loader-926: OCI error while executing delete/truncate (due to REPLACE/TRUNCATE keyword)
**ORA-02292:** integrity constraint (SCHEMA.FK5BA979794B0A176A) violated - child record found

when using TRUNCATE:

SQL*Loader-926: OCI error while executing delete/truncate (due to REPLACE/TRUNCATE keyword)
**ORA-02266:** unique/primary keys in table referenced by enabled foreign keys

I've also tried reordering which the tables are populated but no luck.

I've also tried disabling constraints

ALTER TABLE TABLE_NAME DISABLE CONSTRAINT SYS_C0090398


ORA-02297: cannot disable constraint (SCHEMA.SYS_C0090398) - dependencies exist
(0 rows affected)

My work around so far is to drop all the tables restart my application and have Hibernate recreate my 5 tables and then run my sql*loader script. Right now I'm in development so not a big deal. But when we go to production, I can't keep restarting and creating tables. Actually when I go to production, I'll create the tables outside of hibernate but right now I have the persistence option hibernate.hbm2ddl.auto set to update.

I'm looking for an elegant (semi)automatic solution to periodically repopulate these 5 tables with data provided to me in the csv/Excel file.

Best Answer

For reusable data loads, I prefer defining static external tables that read from CSV files. You'll need an Oracle DIRECTORY object to read the file.

This example creates a five-column external table called FOO_TAB that reads the FOO_TAB.CSV file from the FOO_DIR directory.

CREATE TABLE foo_tab
(
   a1   VARCHAR2(4000),
   a2   VARCHAR2(4000),
   a3   VARCHAR2(4000),
   a4   VARCHAR2(4000),
   a5   VARCHAR2(4000),
)
ORGANIZATION EXTERNAL
(
   TYPE ORACLE_LOADER
   DEFAULT DIRECTORY foo_dir
   ACCESS PARAMETERS
   (
      RECORDS DELIMITED BY newline
      BADFILE 'foo_tab.bad'
      DISCARDFILE 'foo_tab.dis'
      LOGFILE 'foo_tab.log'
      SKIP 1 /* first row just has column headers */
      FIELDS TERMINATED BY ","  OPTIONALLY ENCLOSED BY '"'
      LTRIM
      MISSING FIELD VALUES ARE NULL
      (
         a1   CHAR(4000),
         a2   CHAR(4000),
         a3   CHAR(4000),
         a4   CHAR(4000),
         a5   CHAR(4000),
      )
   )
   LOCATION (foo_dir:'foo_tab.csv')
)
REJECT LIMIT UNLIMITED;