Loading XML documents to Oracle 11g DB with control file

oraclesql-loaderxml

I am using Oracle 11g XML database and trying to load XML documents to this DB with a control file and the sqlldr utility. All these XML files have an element that contains a date string with time stamp (and the letter T in the middle of it). Oracle rejects this date string because of T in it and thus the XML file is not loaded to the DB.

I want to use the Oracle function TO_TIMESTAMP_TZ on the date string during the data load, but I do not know how to do it. That's where I need help. If there is any other way to import the XML (with date string with timestamp), I will try that also.

Here is the date entry in XML file:

<ns3:EntryDateTime cls="U">2013-04-20T21:02:52.468-04:00</ns3:EntryDateTime>

And here is entire code the control file:

load data infile 'filelist.dat' 
  append into table STXP xmltype(XMLDATA)
  ( filename filler char(120), XMLDATA lobfile(filename) terminated by eof )

I believe that I can execute the above control file with the sqlldr utility on SQL*Plus command line also, but not sure about this option. If this is possible, I guess I can ALTER SESSION (to somehow format date string) on command line before executing the control file.

The filelist.dat mentioned above contains entries for input XML file, with one line listing one XML file. The above date entry is required in each XML file. Each XML file has about 50 different elements, some required and some optional. I would greatly appreciate your help.

UPDATE:
I successfully registered the schema, which contains definition for the date string, and 100 other schema, with a script. Since this script is very large, I am posting only 2 registration portions of it:

DECLARE
SCHEMAURL VARCHAR2( 100 );
SCHEMADOC VARCHAR2( 100 );
BEGIN
SCHEMAURL := 'http://www.some.org/stxp/DataTypes.xsd';
SCHEMADOC := 'DataTypes.xsd';
DBMS_XMLSCHEMA.registerSchema( 
    SCHEMAURL, 
    BFILENAME( 'XSD_DIR', SCHEMADOC ),
    LOCAL => TRUE, -- local
    GENTYPES => TRUE,  -- generate object types
    GENBEAN => FALSE, -- no java beans
    GENTABLES => TRUE,  -- generate object tables
    OWNER => USER );
    SCHEMAURL := 'http://www.some.org/stxp/STXP.xsd';
    SCHEMADOC := 'STXP.xsd';
    DBMS_XMLSCHEMA.registerSchema( 
    SCHEMAURL, 
    BFILENAME( 'XSD_DIR', SCHEMADOC ),
    LOCAL => TRUE, -- local
    GENTYPES => TRUE,  -- generate object types
    GENBEAN => FALSE, -- no java beans
    GENTABLES => TRUE,  -- generate object tables
    OWNER => USER );
  END;
  /

The 2nd registration above is the last in the script, and this creates the table STXP, in which I am trying to load about 800 XML files. Each XML file has a root element called stxp.

This is the relevant definition of date string:

 <xsd:simpleType name="DT" xdb:SQLType="TIMESTAMP WITH TIME ZONE">
    <xsd:restriction base="xsd:dateTime"/>
</xsd:simpleType>

And this is how I am using the above definition:

<element name="EntryDateTime" type="oth:DT"/>

When I make the above element optional (for testing purpose) and remove the date string entry (mentioned near the top of this question) from my XML file, the XML file is loaded successfully to Oracle XML database. When I put this entry back to XML file (because it is required), Oracle rejects it.

Because I let Oracle take care of population of STXP table with data from XML files, I am not sure if I can set a trigger to pre-process the date string from the input XML file before saving it in database. i think there is a way to do it in the control file.

Best Answer

I think the way you'll have to do it is a Jokke suggest: instead of defining the DT datatype as TIMESTAMP WITH TIME ZONE, rather define it as VARCHAR2(29) where I think that 29 characters should be enough but you could define it wider.

Then when you extract data from the XML you'll have to convert it to TIMESTAMP WITH TIME ZONE.

However, you could also define the column as an "xsd:dateTime". See http://www.adp-gmbh.ch/xml/schema.html and http://www.w3.org/TR/xmlschema-2/#dateTime