SQL Server – How Derived Table with CROSS APPLY Works

cross-applyctefunctionssql server

I borrowed some code on how to compact and uncompact a string of diagnosis codes in order to build a bridge table for my data warehouse. It works well. However, I just do not understand how it is doing the uncompacting. Here Is the SQL Fiddle for the below code

create table dimDiagnosisGroup (dxGroupKey int, dxCodeList nvarchar(1024))
insert into dimDiagnosisGroup
values  (1,'042, 070.70, 722.10'),
        (2,'042, 070.70, 780.52, 496, 716.90, 581.9'),
        (3,'042, 070.70, 782.1, V58.69'),
        (4,'042, 070.70, 782.3, V58.69')


 WITH XMLTaggedList AS (
     SELECT dxGroupKey,
        CAST('<I>' + REPLACE(dxCodeList, ', ', '</I><I>') + '</I>' AS XML)
            AS Diagnosis_Code_List
     FROM dimDiagnosisGroup
 )
     SELECT dxGroupKey,
            ExtractedDiagnosisList.X.value('.', 'VARCHAR(MAX)') AS dxCodeList2
          FROM XMLTaggedList
        CROSS APPLY Diagnosis_Code_List.nodes('//I') AS ExtractedDiagnosisList(X)

I understand the XMLTaggedList part fine. What I am not understanding is the Cross Apply to the ExtractedDiagnosisList(X), then the ExtractedDiagnosisList.X.value('.', 'VARCHAR(MAX)').

When I hover over ExtractedDiagnosisList in the select statement, SSMS says it is a derived table. However, it kind of looks like a function to me. I am not understanding how Diagnosis_Code.List gets the .nodes('//I') function. Finally, the ExtractedDiagnosisList.X.value section just looks foreign to me, in SQL. It looks like syntax from a language like C#.

Best Answer

What you`re seeing is the XQuery implementation in SQL Server . Although XQuery uses its own parser and performs its own algebrarization during the query compilation stage, the results are combined and optimized together with the DML portion of the query, then combined into a single execution plan.

SQL Server supports five different methods. value , exist , query , and nodes are used to access and transform the data. Last one, modify , uses XML DML to modify the data.

The value() method returns a scalar value from the XML instance Lets say you have the xml : `

declare @X xml =
'<Order OrderId="42" OrderTotal="49.96">
<Customer Id="123"/>
<OrderLineItems>
<OrderLineItem>
<ArticleId>250</ArticleId>
<Quantity>3</Quantity>
<Price>9.99</Price>
</OrderLineItem>
</OrderLineItems>
</Order>’

select @X.value('/Order[1]/Customer[1]/@Id','int')`or
select @X.value('(/Order/Customer/@Id)[1]','int')

Would give you ID of the first customer from the first order

In your particular case value('.') means give me all values from the shredded element( that i will talk about in a moment) Remember in using value function you`re moving through XML. Now to make it more easier ,not to 'move' too much you can use function node which shreds XML into relational data. It returns a row set with rows representing the nodes identified by the path expression.

Example:

declare @X xml =
'<Order OrderId="42" OrderTotal="49.96">
<Customer Id="123"/>
<OrderLineItems>
<OrderLineItem>
<ArticleId>250</ArticleId>
<Quantity>3</Quantity>
<Price>9.99</Price>
</OrderLineItem>
</OrderLineItems>
</Order>'


select C.t.value('(@Id)[1]','int')
from  @X.nodes('/Order/Customer') as C(t)

When you use the nodes() method with the XML column from the table, you must use the APPLY operator.

Example:

declare @X as table (xmlCode varchar(1000))
insert into @X values (
'<Order OrderId="42" OrderTotal="49.96">
<Customer Id="123"/>
<OrderLineItems>
<OrderLineItem>
<ArticleId>250</ArticleId>
<Quantity>3</Quantity>
<Price>9.99</Price>
</OrderLineItem>
</OrderLineItems>
</Order>');

WITH CTE AS(
select CAST(xmlCode as XML) as X from @X
)
select C.t.value('(@Id)[1]','int')
from  CTE
CROSS APPLY X.nodes('/Order/Customer') as C(t)

I used examples on value and node, because you provided the code with those two functions only, if you want to read more about it please visit this

Hope this simple examples give you an idea of how to query xml types