I have a table named data with a column pid & display_data.
Pid is the serial number of all rows.
The value of display_data column is as below :
For each row display_Data has values in the format =>
For PID= 1:
The labels are constant for all rows which would become column headers. Values are changing with respect to labels.
There are total 400 such rows hence, the query needs to be dynamic.
I want to parse this column value,strip html tags and into a new view split it using sql query(Regexp_Replace maybe) such that:
All label values become columns, i.e.
The actual data is much more with many labels and values however this is just sample to get help.
Best Answer
TL;DR Answer
Convert HTML to XML and parse with
XMLTABLE()
Long Answer
HTML and XML are both descendants of SGML.
From the sample data you have posted, it looks like you just need to remove the
<br>
tags and wrapdesc_data
in a root element (eg<row>
) for the conversion into XML.Once in XML format, you can parse the data with
XMLTABLE()
to give you:<span>
( viafor ordinality
, column nameRN
)@label
<span>
This is done within the CTE named
xml_data
in my example.Assuming
value
always followslabel
, you can convert the instance of<span>
(RN
) to have the same value for each pair. (this is also done inxml_data
)From there, you can
PIVOT
the data into a table that has:RID
)label
(LABEL
is a key word so I usedLABEL_TXT
)value
(VALUE
is a key word so I usedVALUE_TXT
)This is done within the CTE labeled
key_val
in my example.For the final query, the final columns names MUST BE KNOWN at compile time. Once you know the columns, you can just do another
PIVOT
for a proper table.Notes
As you can see, HTML is not a good data format to use for parsing data and/or using the data within a computer (eg a database).
Unless
desc_data
is used for display, I recommend that the data is stored in XML or JSON format. (I'm assuming you can change the source)Once data is in a parseable format, PL/SQL is no longer needed.
Example Code