Sql-server – How to interpret this XML plan fragment

explainsql serversql-server-2008-r2

I am looking at the query plan produced by SQL Server 2008 R2 and I'm not sure I'm interpreting it properly. Examples:

<RelOp NodeId="2" PhysicalOp="Sort" LogicalOp="Sort" EstimateRows="1" EstimateIO="0.0112613" 
 EstimateCPU="0.000100822" AvgRowSize="822" EstimatedTotalSubtreeCost="1.06489" Parallel="0" 
 EstimateRebinds="0" EstimateRewinds="0">
 ...
  <MemoryFractions Input="0.5" Output="1" />
  <RunTimeInformation>
    <RunTimeCountersPerThread Thread="0" ActualRows="10190123" ActualRebinds="1" ActualRewinds="0" ActualEndOfScans="1" ActualExecutions="1" />
  </RunTimeInformation>

and

<RelOp NodeId="6" PhysicalOp="Index Seek" LogicalOp="Index Seek" EstimateRows="1" EstimateIO="1" 
 EstimateCPU="0.0502411" AvgRowSize="24" EstimatedTotalSubtreeCost="1.05024" TableCardinality="4.93715e+009" 
 Parallel="0" Partitioned="1" EstimateRebinds="0" EstimateRewinds="0">
 ...
  <RunTimeInformation>
    <RunTimeCountersPerThread Thread="0" ActualRows="10190123" ActualEndOfScans="1" ActualExecutions="1" />
  </RunTimeInformation>

Specifically, seeing EstimateRows="1" and ActualRows="10190123" I suspect that the optimizer incorrectly estimates the number of rows that will be processed. Subsequently, underestimated row set for the sort might lead to a major sort spill.

Edit:

The query in question looks similar to

select whatever from mytable
where integer_column1 > 848484884 and
      integer_column1 < 949494949 and
      another_int_col in (5, 8, 15, 20)

where the cardinality of integer_column1 is in millions and that of another_int_col is a few dozen (a flag of sorts). There is a non-unique, non-clustered index on (integer_column1, another_int_col). I wonder if even in the presence of updated statistics the optimizer can use the histogram/density vector of that index though.

Does my reasoning make sense?

(I don't have access to the actual system or the Management Console for that matter — only the XML plan)

Best Answer

To answer my own question, my original understanding was correct: the discrepancy between EstimateRows="1" and ActualRows="10190123" was an indication of incorrect optimizer estimates, caused by stale statistics (as suggested by commenters). With updated statistics the node in the plan corresponding to node 6 in the question looks like so:

<RelOp NodeId="25" PhysicalOp="Index Seek" LogicalOp="Index Seek" 
 EstimateRows="1.25676e+007" EstimateIO="23.8363" EstimateCPU="1.78132" AvgRowSize="24" 
 EstimatedTotalSubtreeCost="25.6176" TableCardinality="4.94087e+009" Parallel="1" 
 Partitioned="1" EstimateRebinds="0" EstimateRewinds="0">
...
<RunTimeInformation>
  <RunTimeCountersPerThread Thread="2" ActualRows="3457399" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="8" ActualRows="0" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="1" ActualRows="0" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="7" ActualRows="0" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="5" ActualRows="8053198" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="3" ActualRows="0" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="6" ActualRows="0" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="4" ActualRows="0" ActualEndOfScans="1" ActualExecutions="1" />
  <RunTimeCountersPerThread Thread="0" ActualRows="0" ActualEndOfScans="0" ActualExecutions="0" />
</RunTimeInformation>

Although the problem could be seen in the XML plan representation directly, for those who want a graphical representation there exist a free tool that does not require the SQL Server Management Studio, SQL Sentry Plan Explorer, or even Windows for that matter -- it is an XSL stylesheet created by the StackOverflow user Justin.

Related Solutions

SQL Server Execution Plan – SHOWPLAN vs ‘Include Execution Plan’ Warnings for the Same Query

This:

SET SHOWPLAN_XML ON;
GO
SELECT * FROM sys.objects;
GO

Is equivalent to pressing Display Estimated Execution Plan on the toolbar (or hitting Ctrl + L). You'll notice that no rows are returned from the query, like there is when you use Include Actual Execution Plan (Ctrl + M).

The spill warning is only a runtime warning. There is no way that SQL Server can know, when displaying the estimated plan, that a spill will happen at runtime. This is because a spill is caused by factors that might only be present during certain invocations of the query (for example, when there is memory pressure). The estimated plan knows roughly how much memory it's going to ask for, but it can't know until execution that it isn't going to get it.

As an aside, may I recommend* our free tool, SQL Sentry Plan Explorer? I think it provides much more obvious information than Management Studio. I recently wrote a lengthy blog post that can act as a tutorial, and Jonathan Kehayias has a great PluralSight course on it as well.

_{* Disclaimer: I work for SQL Sentry.}

Sql-server – Why/when does the SQL Server query optimizer use EQ_ROWS

My best guess is, the optimizer uses EQ_ROWS every time the predicate equals (hence the EQ) the value in the sampled histogram.

Best Answer

Related Solutions

SQL Server Execution Plan – SHOWPLAN vs ‘Include Execution Plan’ Warnings for the Same Query

Sql-server – Why/when does the SQL Server query optimizer use EQ_ROWS

Related Question