Database design:
Table Name : Plays ( 6 million records)
Columns: Play id, col1, col2……col7, PlayFieldValues
PlayFieldValues xml column. sample xml stored in this column
<PlayAttributes>
<PlayFields>
<PlayField ID="Play.GenericInt1" >25</PlayField>
<PlayField ID="X" TS="">0</PlayField>
<PlayField ID="Y" >OFF</PlayField>
</PlayFields>
</PlayAttributes>
Each xml has more than 25 elements
Using following query I am returning all Playid's from plays table having "Play.GenericInt1" attribute value 25 (we can add more conditions in where clause. This is just an example.)
Select
playId
FROM
(
SELECT playId, x.v.value('@ID[1]', 'VARCHAR(100)') AS playfield ,
x.v.value('.', 'VARCHAR(100)') AS playfieldvalue
from plays
cross apply PlayFieldValues.nodes('/PlayAttributes/PlayFields/PlayField') x(v)
)T
where
T.playfield in('Play.GenericInt1') and T.playfieldvalue=25
Problem: This query is working great on plays table with 1k or 2k rows, but it is not scaling when running on 5 million rows table. execution time is crossing 20 min. Please advice what types of methods and procedures in sql I can use to search the xml column. I tried XML indexes but it costing us lot of space and maintenance problems.
Best Answer
You may improve your performance by using xml exist, instead of extracting the values.
...but without any xml indexes this is still going to involve a table scan.