Sql-server – Actual Execution Plan – Estimated/Actual Rows Differ Greatly

execution-plansql serversql-server-2012

I'm wondering if I can speed this query up. One thing I noticed is the estimated number of rows differs greatly from the actual number of rows.

https://www.brentozar.com/pastetheplan/?id=H1ya3-3pe

Best Answer

ClientRequestID in tblCR_ClientRequestTIP has no statistics, and SQL Server would love to have some. You can see that warning in the actual execution plan. This leads me to believe perhaps you have auto create statistics turned off. This may be causing issues. Have you updated statistics recently?

I'd consider adding the following indexes:

CREATE NONCLUSTERED INDEX IX_tblCR_ClientRequestTIP_ClientRequestID
ON dbo.tblCR_ClientRequestTIP(ClientRequestID)
INCLUDE (TIP_BeginDate, TIP_LocationName1);

CREATE NONCLUSTERED INDEX IX_tblCR_ClientRequestJOB_ClientRequestID
ON tblCR_ClientRequestJOB(ClientRequestID)
INCLUDE (JOB_Date);

CREATE NONCLUSTERED INDEX IX_tblCR_ClientRequestPP_ClientRequestID
ON dbo.tblCR_ClientRequestPP(ClientRequestID)
INCLUDE (PP_BeginningDate)

(there are others that could be added as well, this is just a sample)

Also of note, tblCommitment, tblStaff, and tblProgram do not have clustered indexes.

The tblClientRequestJOB row estimates are not exactly perfect, but the actual data size vs the estimated memory required for that table shows that row estimate is not really problematic. I doubt SQL Server would choose a different plan if it knew the actual row counts ahead of time in that instance. A far bigger change is likely to come from getting rid of the eager index spool resulting from the clustered index scan of tblCR_ClientRequestTIP.

Related Solutions

Sql-server – Estimated vs Actual rows differences (actual much smaller than estimated) – sort

There is no statistics generated on XML columns. The estimates is guessed based on the expressions used when querying the XML.

Using this table:

create table T(XMLCol xml not null)
insert into T values('<root><item value = "1" /></root>')

And this rather simple XML query:

select X.N.value('@value', 'int')
from T
  cross apply T.XMLCol.nodes('root/item') as X(N)

Will give you one row returned but the estimated rows returned is 200. It will be 200 regardless of what XML or how much XML you stuff into the XML column for that one row.

This is the query plan with the estimated row count displayed.

enter image description here

A way to improve, or at least change, the estimates is to give the query optimizer some more information about the XML. In this case, because I know that root really is a root node in the XML, I can rewrite the query like this.

select X2.N.value('@value', 'int')
from T
  cross apply T.XMLCol.nodes('root[1]') as X1(N)
  cross apply X1.N.nodes('item') X2(N)

That will give me an estimate of 5 rows returned.

enter image description here

The rewrite of the query will probably not speed up the shredding of the XML but if the estimates are better, chances are that the query optimizer can make smarter decisions for the rest of the query.

I have not found any documentation on what the rules are for the estimates other than a presentation by Michael Rys where he says:

Base cardinality estimate is always 10’000 rows!
Some adjustment based on pushed path filters

Sql-server – Estimated vs Actual Number of Rows

Up to date statistics does not necessarily imply perfect estimation. If you try to represent a range of 170 million values with 200 samples, even a small skew can throw you off by a factor of 10.

Here is a hint: time series are almost always accessed by time. Very likely DateStamp should be clustered key. For more detailes, post the exact DDL and actual execution plan (XML, not picture).

Best Answer

Related Solutions

Sql-server – Estimated vs Actual rows differences (actual much smaller than estimated) – sort

Sql-server – Estimated vs Actual Number of Rows

Related Question