There is no statistics generated on XML columns. The estimates is guessed based on the expressions used when querying the XML.
Using this table:
create table T(XMLCol xml not null)
insert into T values('<root><item value = "1" /></root>')
And this rather simple XML query:
select X.N.value('@value', 'int')
from T
cross apply T.XMLCol.nodes('root/item') as X(N)
Will give you one row returned but the estimated rows returned is 200. It will be 200 regardless of what XML or how much XML you stuff into the XML column for that one row.
This is the query plan with the estimated row count displayed.
A way to improve, or at least change, the estimates is to give the query optimizer some more information about the XML. In this case, because I know that root
really is a root node in the XML, I can rewrite the query like this.
select X2.N.value('@value', 'int')
from T
cross apply T.XMLCol.nodes('root[1]') as X1(N)
cross apply X1.N.nodes('item') X2(N)
That will give me an estimate of 5 rows returned.
The rewrite of the query will probably not speed up the shredding of the XML but if the estimates are better, chances are that the query optimizer can make smarter decisions for the rest of the query.
I have not found any documentation on what the rules are for the estimates other than a presentation by Michael Rys where he says:
Base cardinality estimate is always 10’000 rows!
Some adjustment based on pushed path filters
Up to date statistics does not necessarily imply perfect estimation. If you try to represent a range of 170 million values with 200 samples, even a small skew can throw you off by a factor of 10.
Here is a hint: time series are almost always accessed by time. Very likely DateStamp
should be clustered key. For more detailes, post the exact DDL and actual execution plan (XML, not picture).
Best Answer
ClientRequestID
intblCR_ClientRequestTIP
has no statistics, and SQL Server would love to have some. You can see that warning in the actual execution plan. This leads me to believe perhaps you haveauto create statistics
turned off. This may be causing issues. Have you updated statistics recently?I'd consider adding the following indexes:
(there are others that could be added as well, this is just a sample)
Also of note,
tblCommitment
,tblStaff
, andtblProgram
do not have clustered indexes.The tblClientRequestJOB row estimates are not exactly perfect, but the actual data size vs the estimated memory required for that table shows that row estimate is not really problematic. I doubt SQL Server would choose a different plan if it knew the actual row counts ahead of time in that instance. A far bigger change is likely to come from getting rid of the eager index spool resulting from the clustered index scan of tblCR_ClientRequestTIP.