SQL Server – XML Queries with Case Insensitive Tag

sql serversql-server-2008-r2xmlxquery

I have a requirement where i want XML tags to be case insensitive in my xml queries.

For example:

DECLARE @myTable TABLE ( yourXML XML )  

INSERT INTO @myTable SELECT '<z><a><b>1</b><c>2</c></a></z>'
INSERT INTO @myTable SELECT '<Z><A><b>1</b><c>2</c></A></Z>'   

SELECT * FROM @myTable WHERE ( [yourXML].exist('for $x in /z/a where  ( ($x/b[1]) = 1 )  return $x')>0 )

Returns output as

yourXML
<z><a><b>1</b><c>2</c></a></z>

But i want the output to display without considering case of the XML tag. So my desired output should be

yourXML
<z><a><b>1</b><c>2</c></a></z>
<Z><A><b>1</b><c>2</c></A></Z>

How it can be done?

Best Answer

Use predicates with name tests like this instead of "simple" node steps:

/*[lower-case(local-name(.)) = 'z']/*[lower-case(local-name(.)) = 'a']

Related Solutions

SQL Server – How to Import XML Data with ID Relationships

You can use a variant of the technique described in Using merge..output to get mapping between source.id and target.id and by Adam Machanic in Dr. OUTPUT or: How I Learned to Stop Worrying and Love the MERGE.

You use merge and capture the generated ID with the XML fragment that belongs to that ID in a table variable and then use that table variable when adding rows to the child tables.

declare @C table
(
  CONSTRUCTION_ID int primary key,
  PROJECT xml
);

merge CONSTRUCTION as T
using (
      select T.X.value('(constructionName/text())[1]', 'varchar(30)') as CONSTRUCTIONNAME,
             T.X.value('(constructionType/text())[1]', 'varchar(30)') as CONSTRUCTIONTYPE,
             T.X.query('project') as PROJECT
      from @xml.nodes('/root/construction') as T(X)
      ) as S
on 0 = 1
when not matched by target then
  insert (CONSTRUCTIONNAME, CONSTRUCTIONTYPE) 
  values (S.CONSTRUCTIONNAME, S.CONSTRUCTIONTYPE)
output inserted.CONSTRUCTION_ID, S.PROJECT into @C;

declare @P table
(
  PROJECT_ID int primary key,
  BUSINESSPARTNERS XML
);

merge PROJECT as T
using (
      select C.CONSTRUCTION_ID,
             T.X.value('(projectname/text())[1]', 'varchar(30)') as PROJECTNAME,
             T.X.query('businesspartners') as BUSINESSPARTNERS
      from @C as C
        cross apply C.PROJECT.nodes('/project') as T(X)
      ) as S
on 0 = 1
when not matched by target then
  insert (CONSTRUCTION_ID, PROJECTNAME)
  values(S.CONSTRUCTION_ID, S.PROJECTNAME)
output inserted.PROJECT_ID, S.BUSINESSPARTNERS into @P;

insert into BUSINESSPARTNERS(PROJECT_ID, PARTNERNAME)
select P.PROJECT_ID,
       T.X.value('text()[1]', 'varchar(30)')
from @P as P
  cross apply P.BUSINESSPARTNERS.nodes('/businesspartners/partnername') as T(X);

SQL Fiddle

SQL Server 2008 R2 – Best Case Insensitive XML Query

You can easily test the performance yourself.

Create a regular table that you can test your queries on.

create table myTable ( yourXML XML )

Add a couple of rows that will give you a match.

INSERT INTO myTable SELECT '<z><a><b>1</b><c>2</c></a></z>'
INSERT INTO myTable SELECT '<Z><A><b>1</b><c>2</c></A></Z>' 
INSERT INTO myTable SELECT '<Z><A><B>1</B><c>2</c></A></Z>'

Add a whole bunch of rows that will not match in different parts of the XML.

insert into myTable 
select top(10000) '<X><A><B>1</B><c>2</c></A></X>'
from sys.all_objects as o1, sys.all_objects as o2

insert into myTable 
select top(10000) '<Z><X><B>1</B><c>2</c></X></Z>'
from sys.all_objects as o1, sys.all_objects as o2

insert into myTable 
select top(10000) '<Z><A><X>1</X><c>2</c></A></Z>'
from sys.all_objects as o1, sys.all_objects as o2

Use SET STATISTICS IO (Transact-SQL) and SET STATISTICS TIME (Transact-SQL) and execute your queries in SQL Server Management Studio.

set statistics time on
set statistics io on

-----Solution 1----------

SELECT * FROM myTable WHERE ( [yourXML].exist('for $x in /*[lower-case(local-name(.)) = "z"]/*[lower-case(local-name(.)) = "a"] where  ( ($x/*[lower-case(local-name(.)) = "b"][1]) = 1 )  return $x')>0 ) 

-------------------------
-----Solution 2----------

SELECT * FROM myTable
WHERE
(CONVERT(XML,LOWER(CONVERT(VARCHAR(MAX),[yourXML]))).exist('for $x in /z/a where  ( ($x/b[1]) = 1 )  return $x')>0 )

-------------------------
-----Solution 3----------

SELECT * FROM myTable WHERE 
([yourXML].exist('for $x in (/Z/A,/z/a) where  ( ($x/b[1],$x/B[1]) = 1 )  return $x') > 0 ) 

-------------------------
-----Solution 4----------

select *
from myTable
where yourXML.exist('(Z/A,z/a)[(b,B)=1]') = 1

Switch to the message tab and evaluate execution time and required reads.

Presumably you have better data in your database to test on. The performance characteristic will change depending on the actual structure of the XML you have to deal with in the real world.

A side note is that your queries are not equivalent.
The first is easily adapted to longer element names.
The second query changes the content of the elements, not just the element names.
The third query does not really deal with case insensitive element names, it just enumerate all possible element names that in this case happens to be different in cases only. If you wanted to deal with lets say a 3 letter element name with your third solution you would have 8 (I think) different permutations to deal with.

I added a fourth solution, mostly because it is short and pretty, that has the same limitations as your third solution. It is in my measurements slightly faster than using FLWOR.

Best Answer

Related Solutions

SQL Server – How to Import XML Data with ID Relationships

SQL Server 2008 R2 – Best Case Insensitive XML Query

Related Question