Recently I tried to search for a particular pattern by converting XML data to varchar(max), although I'm aware it's not best practice, and found that it does not work as expected:
Setup
declare @container table(
[Response] xml not null
);
declare @xml xml =
'<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://abc.com/xsd" xmlns:ns="http://abc.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Header>
<ns:MessageHeader>
<xsd:ID>ABC</xsd:ID>
<xsd:Date>2018-12-31T23:59:59</xsd:Date>
</ns:MessageHeader>
</soapenv:Header>
<soapenv:Body>
<ns:MessageResponse>
<ns:return>
<xsd:ResponseList xsi:nil="true" />
</ns:return>
</ns:MessageResponse>
</soapenv:Body>
</soapenv:Envelope>';
insert into @container values (@xml);
This query works
select *
from @container
where cast(Response as varchar(max))
like '%<xsd:ResponseList xsi:nil="true"%';
Notice that the trailing wildcard is placed 3 characters (i.e. ' />') before the end of the XML node.
But this one does not:
select *
from @container
where cast(Response as varchar(max))
like '%<xsd:ResponseList xsi:nil="true" %' -- with space
or cast(Response as varchar(max))
like '%<xsd:ResponseList xsi:nil="true" />%'; -- whole XML node
I suspect this is due to escape characters, and I tried a few alternatives to no avail. I would appreciate it if someone could shed some light on this.
EDIT (ANSWERED)
The following query works, based on Mr. Browstone's insight:
select *
from @container
where cast(Response as varchar(max))
like '%<xsd:ResponseList xsi:nil="true"/>%';
Here's my follow-up question on Code Review, using an XQuery expression:
T-SQL Verify whether XML node from SOAP request contains any child nodes
Best Answer
This is by design.
When you store a document using the XML data type, it is compressed and organised into a structure that SQL Server can perform operations on efficiently. One of the steps in this process is generating the InfoSet. While doing so, SQL Server removes anything that it determines to be unnecessary; in your example, that is the whitespace before the closing />.
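A minimal sketch of the effect, using a made-up one-element fragment rather than the SOAP envelope from the question (the variable name @fragment and the element are illustrative assumptions). It round-trips the text through the xml type and back to a string:

```sql
-- Illustrative fragment (not from the original post): an empty element
-- written with a space before the closing "/>".
declare @fragment xml = '<a b="1" />';

-- The xml type stores a parsed representation, so casting back to a string
-- re-serializes the element rather than returning the original text.
select cast(@fragment as varchar(max)) as Serialized; -- <a b="1"/>
```

The space before "/>" is gone in the result, which is why a LIKE pattern that includes that space finds no match while the pattern without it does.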
When you select the entire contents of the field (such as when you convert it to NVARCHAR(MAX)), it rebuilds the XML document before returning it. This document may not be an identical copy of the document that you inserted. For example, if you have used self-closing elements, SQL Server may return opening and closing elements instead. The documentation describes this behaviour as well.
So, if you want to store an exact copy of your document, then NVARCHAR(MAX) or VARCHAR(MAX) is the best option. You can then convert it to XML to query it later on (though this can be costly).
For more information, see the documentation on XML Data Type and Columns (SQL Server) and also Define the Serialization of XML Data, which outlines the rules that SQL Server applies when converting XML to a string type.
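The trade-off above could look roughly like the following sketch. The table variable @raw and the shortened fragment are hypothetical stand-ins for the original setup, and the XQuery predicate is one possible way to test for the nil attribute, not necessarily the approach from the follow-up question. The xsi prefix is left undeclared on the assumption that it is predeclared in SQL Server's XQuery implementation.

```sql
-- Hypothetical table variable: the raw text is kept verbatim in nvarchar(max).
declare @raw table ([Response] nvarchar(max) not null);

insert into @raw values
(N'<ns:return xmlns:ns="http://abc.com" xmlns:xsd="http://abc.com/xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><xsd:ResponseList xsi:nil="true" /></ns:return>');

-- The string is stored exactly as inserted, so a pattern that includes
-- the space before "/>" can match the original text.
select *
from @raw
where [Response] like N'%<xsd:ResponseList xsi:nil="true" />%';

-- Converting to xml at query time allows XQuery instead of string matching,
-- at the cost of parsing every row that is examined.
with xmlnamespaces ('http://abc.com/xsd' as xsd)
select *
from @raw
where cast([Response] as xml).exist('//xsd:ResponseList[@xsi:nil="true"]') = 1;
```

The XQuery form does not depend on how the element happens to be written (spacing, attribute quoting, or self-closing versus paired tags), which is the main reason to prefer it over LIKE when the data is available as xml.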