Sql-server – Parsing Typed XML with a default namespace and inconsistent root node

sql serversql-server-2012xmlxquery

I have a structured, typed xml file.

<WebService xmlns="http://www.orbis-software.com/WebSvcCon">
 <NewLeads>
  <OutputSchema>
  <root xmlns="" type="array">
   <item type="object">
   <SaleProperty type="object">
    <Type type="string">Freehold</Type>
    <PostDistrict type="string">xxx</PostDistrict>
    <Address type="string">address</Address>
     <Value type="number">17.0</Value>
   </SaleProperty>
   <Transaction type="string">SaleOnly</Transaction>
   <Quote type="object">
    <Sale type="object">
     <Fees type="number">450.0</Fees>
     <Disbursements type="number">0.0</Disbursements>
     <StampDuty type="number">0.0</StampDuty>
     <Total type="number">450.0</Total>
    </Sale>
   <Discount type="number">0.0</Discount>
   <VAT type="number">90.0</VAT>
   <Total type="number">540.0</Total>
  </Quote>
  <MoverId type="number">12345678</MoverId>
  <Name type="string">Mr AS</Name>
  <Email type="string">as@yahoo.ppp</Email>
  <Telephone type="string">0123456789</Telephone>
  <Comments type="string">Joint ownership</Comments>
  <EstimatedMoveDate type="string">2015-11-25T05:57:00</EstimatedMoveDate>
  <Charge type="number">4.99</Charge>
  <ChargeStatus type="string">Chargeable</ChargeStatus>
 </item>
</root>
</OutputSchema></NewLeads></WebService>

Additionally, object SaleProperty may be missing, instead may be object PurchaseProperty. Similarly, object Quote can contain Sale, Purchase, Remortgage objects or sale and Purchase simultaneously
I can't find any tips on inserting it into one table via xquery on SQL Server 2012. Basic problem is that every time I'm building query to extract even one value (i.e. Transaction) only thing except errors I get is empty string.

my sample query (based on MSDN examples):

SELECT t.c.value('(.)[1]','varchar(50)') as type
from @ixml.nodes('/root/item/transaction') as t(c)

I would like to have an example how to insert this xml into table (where every possible element has it's own column.

I assume that problem is with namespaces and strong typing of the xml. However, this is what I get from web service. I work on xml both missing and containing second xmlns declaration in root element, as well as with structure trimmed down to just root.

Best Answer

The three problems with your XQuery that I can see are (and none of this has anything to do with it being Typed XML):

  1. You are not specifying the correct path to the /root node. It should be:

    SELECT t.c.value('(.)[1]','varchar(50)') as type
    from @ixml.nodes('/WebService/NewLeads/OutputSchema/root/item/transaction') as t(c)
    
  2. XML is case-sensitive so you need to use an upper-case "T" for the "Transaction" node:

    SELECT t.c.value('(.)[1]','varchar(50)') as type
    from @ixml.nodes('/WebService/NewLeads/OutputSchema/root/item/Transaction') as t(c)
    
  3. While those fixes would probably get you 1 value back (in this case, it should be "SaleOnly"), it won't iterate through the "item" nodes because you were too specific in the path supplied to the .nodes() function. Instead, the final node of that specification should be "item", which is what you want to iterate through. And in that case, you move the "Transaction" part up to the .value() function:

    SELECT t.c.value('(./Transaction)[1]','varchar(50)') as type
    from @ixml.nodes('/WebService/NewLeads/OutputSchema/root/item') as t(c)
    

Regarding the statement of:

I work on xml both missing and containing second xmlns declaration in root element, as well as with structure trimmed down to just root.

The "structure trimmed down to just root" should be handleable by removing everything between the first / and the / right before /root... (i.e. the WebService/NewLeads/OutputSchema). So the resulting path would be:

from @ixml.nodes('//root/item') as t(c)

NOTE:
I am not able to get this to work 100% with the namespace declared in the <WebService> element (please see additional notes as this is no longer the case). Taking that out makes it work. Giving it a prefix, such as xmlns:something="http://www.orbis-software.com/WebSvcCon" makes it work, but then that needs to be declared in the methods. The only way I can get it to work right now is by declaring the default namespace in each XML function (.nodes and .value) as follows:

SELECT t.c.value('declare default element namespace "http://www.orbis-software.com/WebSvcCon";
                  (./Transaction)[1]','varchar(50)') as [type]
from @ixml.nodes('declare default element namespace "http://www.orbis-software.com/WebSvcCon";
                  /WebService/NewLeads/OutputSchema/root/item') as t(c)

NOTE 2:
Even better, you can use WITH XMLNAMESPACES to declare one or more namespaces to use for the entire query, so no need to define in each XML function. The following both work:

;WITH XMLNAMESPACES (DEFAULT 'http://www.orbis-software.com/WebSvcCon')
SELECT t.c.value('(./Transaction)[1]','varchar(50)') as [type]
from @ixml.nodes('/WebService/NewLeads/OutputSchema/root/item') as t(c)


;WITH XMLNAMESPACES (DEFAULT 'http://www.orbis-software.com/WebSvcCon')
SELECT t.c.value('(./Transaction)[1]','varchar(50)') as [type]
from @ixml.nodes('//root/item') as t(c)

However, just keep in mind that if the document is missing the <WebService xmlns="http://www.orbis-software.com/WebSvcCon"> element and hence has no default namespace, then you need to remove the ;WITH XMLNAMESPACES part. Of course, if the <root> element has its own default namespace, then maybe you will need to keep it. You can play around with it until it works, now that you know the connection between these pieces.

NOTE 3:
If you do end up having two default namespaces declared -- one in the <WebService> element and one in the <root> element -- then you need to specify the URI noted in <root xmlns="bob"> and the // syntax instead of the fully-qualified path. So if your XML looked like:

<WebService xmlns="http://www.orbis-software.com/WebSvcCon">
 <NewLeads>
  <OutputSchema>
  <root xmlns="http://someplace" type="array">

You would then use:

;WITH XMLNAMESPACES (DEFAULT 'http://someplace')
SELECT t.c.value('(./Transaction)[1]','varchar(50)') as [type]
from @ixml.nodes('//root/item') as t(c)

But that won't help if you do have the <WebService> element and yet the <root> element is missing the xmlns declaration. In that case you still need to specify the namespace noted in the <WebService> element. Fun, fun, fun :-).

NOTE 4:
Even more better: incorporating something mentioned in @wBob's answer, we can actually get rid of the ;WITH XMLNAMESPACES clause, and instead use a namespace wildcard. You just need to prefix every node of every XML function with *:. Now the query should look as follows:

SELECT t.c.value('(./*:Transaction)[1]','varchar(50)') AS [type],
       t.c.value('(./*:SaleProperty/*:PostDistrict)[1]','varchar(50)') AS [PostDistrict]
FROM  @ixml.nodes('//*:root/*:item') t(c);

Doing this means that the query works in all of your scenarios:

  1. Full structure starting with "WebService" node, second xmlns declaration:

    <WebService xmlns="http://www.orbis-software.com/WebSvcCon">
      <NewLeads>
        <OutputSchema>
          <root xmlns="uri" type="array">
    
  2. Full structure starting with "WebService" node, single xmlns declaration:

    <WebService xmlns="http://www.orbis-software.com/WebSvcCon">
      <NewLeads>
        <OutputSchema>
          <root type="array">
    
  3. Trimmed down structure starting with "root" node, single xmlns declaration:

    <root xmlns="uri" type="array">