Db2 – does xml element exist, regardless of value

db2db2-luwxml

EDIT: I demonstrated my ignorance by confusing attribute with element, I updated my question. Sorry. To clarify, I'm looking for the existence of the element code in:

<conditionmessage:code>MYCODE</conditionmessage:code>

I have a bunch of schemas where there is a table that contains an XML column. There are a handful XML indexes on each column. I'm pretty sure that the XML-schemas differs between the SQL-schemas, nevertheless, the same XML-indexes exists in each schema. I suspect copy/paste and I can't seem to find any documentation for the XML-schemas. So I thought of a more pragmatic approach namely to loop through the XML columns and check whether each element exists or not. The tables have been around for a while, so every possible element should be there by now.

Now to the question, what's the best way to check for the existence of an element, regardless of value? Unfortunately, my brain is incompatible with XPATH/XQUERY, and I have to struggle every time to just convince my self that the query does what it is supposed to do. My current approach seems to work, but I'm curious if there is a better way. The cardinality of the smallest table is ~100.000 rows and the largest ~100.000.000 rows:

import ibm_db
import ibm_db_dbi
...
conn = ibm_db_dbi.connect("...")
c1 = conn.cursor()

for schema in [ "S1", "S2", ... ]:
    for attribute in [ "a1", "s2", ...]:
        sql = f"""
            select * from {schema}.ATOM_ENTRY
            WHERE XMLEXISTS('$d//*:{attribute}[fn:matches(text(), $p, "i")]' 
                PASSING ENTRY_XML AS "d", CAST('\S*' AS VARCHAR(128)) as "p")
            fetch first 1 rows only
            optimize for 1 rows"""
        c1.execute(sql)
        exists = False
        for row in c1.fetchall():
            exists = True
        print(f"{schema}.ATOM_ENTRY -> {attribute} : {exists}")

Best Answer

Try this:

For attributes

SELECT COUNT(x.a2) cnt_a2, COUNT(x.a4) cnt_a4
FROM
(
VALUES 
  '<doc><a><b a2=""><c a2=""/></b></a></doc>'
, '<doc><a><b a3=""/></a></doc>'
) T (D)
, XMLTABLE
(
  '$doc' PASSING XMLPARSE(DOCUMENT T.D) AS "doc"
  COLUMNS
    a2 XML PATH '//*[fn:exists(@a2)]'
  , a4 XML PATH '//*[fn:exists(@a4)]'
) X 
WHERE 
--XMLEXISTS('$doc//*[@a4 or @a5]' PASSING XMLPARSE(DOCUMENT T.D) AS "doc")
XMLEXISTS('$doc//*[@a2 or @a4]' PASSING XMLPARSE(DOCUMENT T.D) AS "doc")
;

The result is:

|CNT_A2     |CNT_A4     |
|-----------|-----------|
|1          |0          |

For elements with namespaces

SELECT COUNT(x.code1) cnt_code1, COUNT(x.code3) cnt_code3
FROM
(
VALUES 
  '<doc xmlns:conditionmessage="http://xyz.com"><conditionmessage:code1>MYCODE</conditionmessage:code1></doc>'
, '<doc xmlns:conditionmessage="http://xyz.com"><conditionmessage:code2>MYCODE</conditionmessage:code2></doc>'
) T (D)
, XMLTABLE
(
  XMLNAMESPACES('http://xyz.com' AS "conditionmessage"), '$doc' PASSING XMLPARSE(DOCUMENT T.D) AS "doc"
  COLUMNS
    code1 XML PATH '//*[fn:exists(conditionmessage:code1)]'
  , code3 XML PATH '//*[fn:exists(conditionmessage:code3)]'
) X
WHERE XMLEXISTS('declare namespace conditionmessage = "http://xyz.com"; $doc//*[conditionmessage:code1 or conditionmessage:code3]' PASSING XMLPARSE(DOCUMENT T.D) AS "doc");

The result is:

|CNT_CODE1  |CNT_CODE3  |
|-----------|-----------|
|1          |0          |