SQL Server – T-SQL Execution Plan – Estimated Number of Rows = 1 – Poor Performing Query

performance, performance-tuning, sql-server-2008-r2, t-sql, tuning

T-SQL, SQL Server 2008 R2

I am trying to tune a poor performing query.

For 3 out of 4 of the largest tables (several million rows each) being queried with index seeks, the plan has decided that the estimated number of rows = 1. I think this is where the issue is: the number of logical reads is extremely high, which I believe is a result of the misestimate.

[Screenshot: logical reads]

I don't know what is making the query engine assume this. Statistics are up to date, I'm not using any user-defined functions, and the query engine is not applying CONVERT_IMPLICIT to any columns. So what do I need to look at or change to make this query more efficient?

The plan is here: https://www.brentozar.com/pastetheplan/?id=S18GOLFig

The variables are missing from Paste the Plan. Here they are:

DECLARE  @1stDay_StartExtract   DATE
        ,@LastDay_EndExtract    DATETIME
        ,@WhenProcessLastTime   DATETIME;

-- First day of the month three months before the current month
SET @1stDay_StartExtract = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) - 3, 0);
-- One second before the first day of the previous month,
-- i.e. the last moment of the month before last
SET @LastDay_EndExtract = DATEADD(SECOND, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) - 1, 0));

-- Widen the extract window by one day on each side
DECLARE  @dtStartDate_LT DATETIME = DATEADD(DAY, -1, @1stDay_StartExtract)
        ,@dtEndDate_LT   DATETIME = DATEADD(DAY, 1, @LastDay_EndExtract);

Let me know if you need more information.

Thank you in advance.

Best Answer

I know of three main ways of addressing a query performance issue caused by a cardinality mis-estimate:

1. Giving the optimizer more information

The query optimizer generally works better when it has higher-quality information to feed its model. Steps here can include updating statistics, creating new statistics, using the RECOMPILE hint so the optimizer sees the actual variable values, or materializing key intermediate result sets to provide better cardinality estimates and indexing opportunities. A sketch of the first few options follows.
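For example (the table, column, and statistics names here are hypothetical, not taken from your plan), these options might look like this:

-- Refresh statistics on one of the large tables with a full scan
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;

-- Create a statistics object on a filtered column that lacks one
CREATE STATISTICS st_BigTable_WhenProcess ON dbo.BigTable (WhenProcess);

-- RECOMPILE lets the optimizer sniff the current variable values
SELECT bt.ID, bt.Amount
FROM dbo.BigTable AS bt
WHERE bt.WhenProcess >= @dtStartDate_LT
  AND bt.WhenProcess <  @dtEndDate_LT
OPTION (RECOMPILE);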

2. Rewriting your query to be more clear to the optimizer

This can include simplifying code to remove redundant filters, or refactoring it so that the optimizer can reason about it more easily. The query looks complex and we don't have the view code, so it's hard to say more. A few of the filters in the query appear to be extremely complex, and it wouldn't surprise me at all if the optimizer cannot make a good guess about how those filters will affect the results. A sketch of this kind of rewrite follows.
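As a hypothetical illustration (none of these names come from your query), a filter that wraps the column in an expression hides the predicate from the column statistics, while the equivalent sargable form is easy to estimate:

-- Before: the expression on the column defeats the estimator and the index
SELECT t.ID
FROM dbo.BigTable AS t
WHERE DATEADD(DAY, 1, t.WhenProcess) >= @1stDay_StartExtract;

-- After: the same logic, but the optimizer can use the column statistics directly
SELECT t.ID
FROM dbo.BigTable AS t
WHERE t.WhenProcess >= DATEADD(DAY, -1, @1stDay_StartExtract);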

3. Taking advantage of SQL Server enhancements

Sometimes there are features that you can turn on that will make SQL Server do a better job with your workload. If you aren't using trace flag 4199, you could test this query with it. Trace flag 4199 enables the collection of query optimizer fixes that Microsoft has shipped over the years; it is on by default in SQL Server 2016. Trace flag 2301 is a bit less straightforward: it makes some changes to the optimizer around join cardinality estimates, and in a rough sense you can say that the optimizer works harder to find a better plan. It is riskier and not nearly as common as trace flag 4199. It might not be practical here, but it's also worth mentioning that each new version of SQL Server changes the optimizer to improve query performance; SQL Server 2014, for example, introduced a new cardinality estimator model which works better for some workloads. A sketch of testing the trace flags follows.
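One low-risk way to test (the query shape below is hypothetical, and QUERYTRACEON requires sysadmin membership or a plan guide) is to scope the trace flag to a single statement:

-- Enable the optimizer hotfixes for just this statement
SELECT t.ID
FROM dbo.BigTable AS t
WHERE t.WhenProcess >= @dtStartDate_LT
OPTION (QUERYTRACEON 4199);

-- Riskier; test carefully before using anything like this in production
-- OPTION (QUERYTRACEON 4199, QUERYTRACEON 2301);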

For your particular query, I also want to note that it's easy to misread the single-row estimate that you're seeing. The estimated number of rows that you see on the inner side of a nested loop is the number of rows returned per iteration of the loop, so the total estimate is that value multiplied by the estimated number of executions. Seeing one row estimated per execution from a nested loop seek is common and often not a sign of a problem.

However, the cardinality estimate for the outer part of the query is a bit off (36,269 actual rows versus 6,976 estimated). It's perfectly natural to see a high number of logical reads with a nested loop, and to suspect that that part of the query is slow and needs to be improved. I find it useful to think about what the query optimizer should do instead to get the data that it needs. Would a hash join be better? A merge join? A nested loop with a different index? Join hints are a quick way to test those alternatives, as sketched below.
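For instance (hypothetical tables again), you can force a join type for a test run and compare the duration and reads against the original plan:

-- Force hash joins for the whole query; also try MERGE JOIN or LOOP JOIN
SELECT o.ID, i.Amount
FROM dbo.OuterTable AS o
INNER JOIN dbo.InnerTable AS i
    ON i.OuterID = o.ID
OPTION (HASH JOIN);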

I don't have the full picture, but the nested loop joins that you called out don't look that bad to me: I don't see any key lookups, and one of the indexes is covering. One way to move forward is to materialize all of the results of the query up until that point into a temp table, gather statistics on the temp table, and then look at the query plan for the adjusted query and see how long it takes to run (a sketch follows). If the query plan changes for the better, then you have a useful clue on how to make it run faster. If it doesn't change, then you at least get a more precise measurement of the part you think is slow. Good luck!
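A minimal sketch of that approach, once more with hypothetical names:

-- Materialize the first part of the query so the rest of the plan
-- starts from a known, statistically described row count
SELECT f.ID, f.Amount, f.WhenProcess
INTO #FirstPart
FROM dbo.BigTable AS f
WHERE f.WhenProcess >= @dtStartDate_LT
  AND f.WhenProcess <  @dtEndDate_LT;

-- Temp tables get statistics automatically; an index helps the later join
CREATE CLUSTERED INDEX cx_FirstPart ON #FirstPart (ID);

-- Now run the remainder of the original query against #FirstPart
SELECT f.ID, o.SomeColumn
FROM #FirstPart AS f
INNER JOIN dbo.OtherTable AS o
    ON o.ID = f.ID;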