Oracle – Converting XMLDB Nodes to Rows

oracleoracle-11g-r2xmlxquery

I have an application that stores XML as XMLType in a table. This is the first time I'm using XMLDB and XMLType features, and was wondering if I could use the results from an XQuery FLWOR expression in a "relational" way.

Considering the following example XML:

<PerfInfo TimeStamp="2014-11-23 17:13:04" GUID="5d4a9a3f-5a1c-478c-b12f-39e9f33172f4">
    <EventLog Count="408">
        <EventRecord LogName="Application" Level="2" TimeCreated="2014-11-17 00:18:31" Provider="Application Error" ID="1000">Faulting application name: wmiprvse.exe</EventRecord>
        <EventRecord LogName="Application" Level="2" TimeCreated="2014-11-17 00:18:43" Provider="Application Error" ID="1000">Faulting application name: wmiprvse.exe</EventRecord>
        <EventRecord LogName="Application" Level="2" TimeCreated="2014-11-18 16:59:28" Provider="McLogEvent" ID="5051">A thread in process C:\Program Files\Common Files\McAfee\SystemCore\mcshield.exe took longer than 90000 ms to complete </EventRecord>
    <EventLog/>
<PerfInfo/>

You'll probably notice that these are extracts from the Windows Event Log.

These results (the actual XML files are, of course, way bigger and contain a lot of other metrics, but the principle remains the same) are stored in following table:

CREATE TABLE SPERF_XML_RAW 
(
    "ID"                NUMBER              NOT NULL, 
    "GUID"              CHAR(36 CHAR)       NOT NULL,
    "HOSTNAME"          VARCHAR2(16 CHAR)   NOT NULL,
    "USERNAME"          VARCHAR2(16 CHAR)   NOT NULL,
    "DATA_XML"          PUBLIC.XMLTYPE      NOT NULL,
    "DATE_CREATED"      TIMESTAMP(6)        NOT NULL,
    "DATE_INSERTED"     TIMESTAMP(6)        DEFAULT SYSDATE NOT NULL,
    CONSTRAINT SPERF_XML_RAW_PK PRIMARY KEY(ID)
)

I can use XQuery to, for example, query all the Provider names of an Event Record in the different XML documents, like this:

SELECT ID, GUID, 
   XMLQuery(
       'for $i in /PerfInfo/EventLog/EventRecord
       return $i/@Provider'
       PASSING DATA_XML RETURNING CONTENT
   ) "DATA_THINGS"
FROM SPERF_XML_RAW;

And the results would be along these lines:

1   5d4a9a3f-5a1c-478c-b12f-39e9f33172f4    Application ErrorApplication ErrorMcLogEvent

However, the results I would like to have is this:

1   5d4a9a3f-5a1c-478c-b12f-39e9f33172f4    Application Error
1   5d4a9a3f-5a1c-478c-b12f-39e9f33172f4    Application Error
1   5d4a9a3f-5a1c-478c-b12f-39e9f33172f4    McLogEvent

Or, when querying multiple documents:

1   5d4a9a3f-5a1c-478c-b12f-39e9f33172f4    Application Error
1   5d4a9a3f-5a1c-478c-b12f-39e9f33172f4    McLogEvent  
2   9784fcc3-ee7d-49a0-bf5d-4ac4fe923fc9    ESENT
2   9784fcc3-ee7d-49a0-bf5d-4ac4fe923fc9    Microsoft-Windows-Search
2   9784fcc3-ee7d-49a0-bf5d-4ac4fe923fc9    McLogEvent
3   cd1054b1-1fe0-4842-a9ed-3f1b3c09ea0f    Application Hang
4   6c95827b-8fba-42d7-b34e-fec0b9e9606a    Group Policy
4   6c95827b-8fba-42d7-b34e-fec0b9e9606a    Microsoft-Restart-Manager

In other words, I want to be able to query my XML document in such a way that each selected Node returns as a Row. Is there a way to achieve this using XMLDB and Oracle 11.2g?

P.S.: In an actual query I would of course retrieve more than just the provider, this is just an example query. I'm looking to build different views into the document to explore the different metrics contained in the documents

Best Answer

I found the answer myself after reading through loads of documentation from Oracle. The answer comes in two parts.

First, my original question can be solved by using XMLTable:

SELECT 
  x.ID            AS "MEASUREMENT_ID", 
  x.GUID          AS "MEASUREMENT_GUID", 
  a.COLUMN_VALUE  AS "MESSAGE"
FROM SPERF_XML_RAW x,
  XMLTable('/PerfInfo/EventLog/EventRecord/@Provider'
            PASSING x.DATA_XML) a;

Note that COLUMN_VALUE is a pseudocolumn that can be used in conjunction with XMLTable

As additional information I would like to add the following. You can create multiple columns with the following syntax:

SELECT 
    x.ID,
    x.GUID,
    res."Provider", 
    res."TimeCreated"
FROM SPERF_XML_RAW x,
    XMLTABLE('/PerfInfo'
      PASSING x.DATA_XML
      COLUMNS 
         "Provider"     VARCHAR2(32 CHAR)   PATH '/PerfInfo/EventLog/EventRecord',
         "TimeCreated"  TIMESTAMP           PATH '/PerfInfo/EventLog/EventRecord') res;

But Oracle will throw ORA-19279: XPTY0004 when one or more columns return multiple nodes.

ORA-19279: XPTY0004 - XQuery dynamic type mismatch: expected singleton sequence - got multi-item sequence
19279. 00000 -  "XQuery dynamic type mismatch: expected singleton sequence - got multi-item sequence" 
*Cause:    The XQuery sequence passed in had more than one item.
*Action:   Correct the XQuery expression to return a single item sequence.

The solution to THAT problem can be found below:

SELECT
  x.ID              AS "MEASUREMENT_ID",
  x.GUID            AS "MEASUREMENT_GUID",
  res."TIMECREATED", 
  res."LOGNAME",
  res."LEVEL",
  res."ID",
  res."PROVIDER",
  res."MESSAGE"
FROM SPERF_XML_RAW x,
   XMLTable('/PerfInfo' 
      PASSING x.DATA_XML
      COLUMNS
        "EventRecords" XMLTYPE PATH 'EventLog') rec,
   XMLTABLE ('EventLog/EventRecord'
       PASSING rec."EventRecords"
       COLUMNS
        "LOGNAME"         VARCHAR(32 CHAR)      PATH '@LogName',
        "ID"              NUMBER                PATH '@ID',
        "LEVEL"           CHAR(1 CHAR)          PATH '@Level',
        "PROVIDER"        VARCHAR2(128 CHAR)    PATH '@Provider',
        "TIMECREATED"     VARCHAR2(32 CHAR)     PATH '@TimeCreated',
        "MESSAGE"         CLOB                  PATH 'text()') res;

This will probably fail once more as soon as your document has too many levels, and I have not included a document with namespaces, which will most likely cause this to explode as well.

All information was found in the XML DB Developers guide: https://docs.oracle.com/cd/B28359_01/appdev.111/b28369/xdb03usg.htm#g1055369