SQL Server – T-SQL Execution Plan – Estimated Number of Rows = 1 – Poor Performing Query

performance, performance-tuning, sql-server-2008-r2, t-sql, tuning

T-SQL, SQL Server 2008 R2

I am trying to tune a poor performing query.

For 3 out of 4 of the largest tables (several million rows each) being queried with index seeks, the plan has decided that the estimated number of rows = 1. I think this is where the issue is: the number of logical reads is extremely high, which I believe is a result of the misestimate.

[Screenshot: logical reads]

I don't know what is making the query engine assume this. Statistics are up to date, I'm not using any user-defined functions, and the query engine is not applying CONVERT_IMPLICIT to any columns. So what do I need to look at or change to make this query more efficient?

The plan is here: https://www.brentozar.com/pastetheplan/?id=S18GOLFig

The variables are missing from Paste the Plan. Here they are:

DECLARE  @1stDay_StartExtract   DATE
        ,@LastDay_EndExtract    DATETIME
        ,@WhenProcessLastTime   DATETIME;

-- First day of the month three months before the current month
SET @1stDay_StartExtract = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) - 3, 0);
-- One second before the first day of the previous month,
-- i.e. the last moment of the month before last
SET @LastDay_EndExtract = DATEADD(SECOND, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) - 1, 0));

-- Widen the extract window by one day on each side
DECLARE  @dtStartDate_LT DATETIME = DATEADD(DAY, -1, @1stDay_StartExtract)
        ,@dtEndDate_LT   DATETIME = DATEADD(DAY, 1, @LastDay_EndExtract);

Let me know if you need more information.

Thank you in advance.

Best Answer

I know of three main ways of addressing a query performance issue caused by a cardinality mis-estimate:

1. Giving the optimizer more information

The query optimizer generally works better when it has higher-quality information to feed its model. Steps here can include updating statistics, creating new statistics, using the RECOMPILE hint so the optimizer sees the actual variable values, or materializing key intermediate result sets to provide better cardinality estimates and indexing opportunities. A sketch of the first few options follows.
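For example (the table, column, and statistics names here are hypothetical, not taken from your plan), these options might look like this:

-- Refresh statistics on one of the large tables with a full scan
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;

-- Create a statistics object on a filtered column that lacks one
CREATE STATISTICS st_BigTable_WhenProcess ON dbo.BigTable (WhenProcess);

-- RECOMPILE lets the optimizer sniff the current variable values
SELECT bt.ID, bt.Amount
FROM dbo.BigTable AS bt
WHERE bt.WhenProcess >= @dtStartDate_LT
  AND bt.WhenProcess <  @dtEndDate_LT
OPTION (RECOMPILE);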

2. Rewriting your query to be more clear to the optimizer

This can include simplifying code to remove redundant filters, or refactoring it so that the optimizer can reason about it more easily. The query looks complex and we don't have the view code, so it's hard to say more. A few of the filters in the query appear to be extremely complex, and it wouldn't surprise me at all if the optimizer cannot make a good guess about how those filters will affect the results. A sketch of this kind of rewrite follows.
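As a hypothetical illustration (none of these names come from your query), a filter that wraps the column in an expression hides the predicate from the column statistics, while the equivalent sargable form is easy to estimate:

-- Before: the expression on the column defeats the estimator and the index
SELECT t.ID
FROM dbo.BigTable AS t
WHERE DATEADD(DAY, 1, t.WhenProcess) >= @1stDay_StartExtract;

-- After: the same logic, but the optimizer can use the column statistics directly
SELECT t.ID
FROM dbo.BigTable AS t
WHERE t.WhenProcess >= DATEADD(DAY, -1, @1stDay_StartExtract);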

3. Taking advantage of SQL Server enhancements

Sometimes there are features that you can turn on that will make SQL Server do a better job with your workload. If you aren't using trace flag 4199, you could test this query with it. Trace flag 4199 enables the collection of query optimizer fixes that Microsoft has shipped over the years; it is on by default in SQL Server 2016. Trace flag 2301 is a bit less straightforward: it makes some changes to the optimizer around join cardinality estimates, and in a rough sense you can say that the optimizer works harder to find a better plan. It is riskier and not nearly as common as trace flag 4199. It might not be practical here, but it's also worth mentioning that each new version of SQL Server changes the optimizer to improve query performance; SQL Server 2014, for example, introduced a new cardinality estimator model which works better for some workloads. A sketch of testing the trace flags follows.
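One low-risk way to test (the query shape below is hypothetical, and QUERYTRACEON requires sysadmin membership or a plan guide) is to scope the trace flag to a single statement:

-- Enable the optimizer hotfixes for just this statement
SELECT t.ID
FROM dbo.BigTable AS t
WHERE t.WhenProcess >= @dtStartDate_LT
OPTION (QUERYTRACEON 4199);

-- Riskier; test carefully before using anything like this in production
-- OPTION (QUERYTRACEON 4199, QUERYTRACEON 2301);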

For your particular query, I also want to note that it's easy to misread the single-row estimate that you're seeing. The estimated number of rows that you see on the inner side of a nested loop is the number of rows returned per iteration of the loop, so the total estimate is that value multiplied by the estimated number of executions. Seeing one row estimated per execution from a nested loop seek is common and often not a sign of a problem.

However, the cardinality estimate for the outer part of the query is a bit off (36,269 actual rows versus 6,976 estimated). It's perfectly natural to see a high number of logical reads with a nested loop, and to suspect that that part of the query is slow and needs to be improved. I find it useful to think about what the query optimizer should do instead to get the data that it needs. Would a hash join be better? A merge join? A nested loop with a different index? Join hints are a quick way to test those alternatives, as sketched below.
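For instance (hypothetical tables again), you can force a join type for a test run and compare the duration and reads against the original plan:

-- Force hash joins for the whole query; also try MERGE JOIN or LOOP JOIN
SELECT o.ID, i.Amount
FROM dbo.OuterTable AS o
INNER JOIN dbo.InnerTable AS i
    ON i.OuterID = o.ID
OPTION (HASH JOIN);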

I don't have the full picture, but the nested loop joins that you called out don't look that bad to me: I don't see any key lookups, and one of the indexes is covering. One way to move forward is to materialize all of the results of the query up until that point into a temp table, gather statistics on the temp table, and then look at the query plan for the adjusted query and see how long it takes to run (a sketch follows). If the query plan changes for the better, then you have a useful clue on how to make it run faster. If it doesn't change, then you at least get a more precise measurement of the part you think is slow. Good luck!
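A minimal sketch of that approach, once more with hypothetical names:

-- Materialize the first part of the query so the rest of the plan
-- starts from a known, statistically described row count
SELECT f.ID, f.Amount, f.WhenProcess
INTO #FirstPart
FROM dbo.BigTable AS f
WHERE f.WhenProcess >= @dtStartDate_LT
  AND f.WhenProcess <  @dtEndDate_LT;

-- Temp tables get statistics automatically; an index helps the later join
CREATE CLUSTERED INDEX cx_FirstPart ON #FirstPart (ID);

-- Now run the remainder of the original query against #FirstPart
SELECT f.ID, o.SomeColumn
FROM #FirstPart AS f
INNER JOIN dbo.OtherTable AS o
    ON o.ID = f.ID;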