Based on my experience with this issue, and as mentioned in this blogs.msdn article:

You need to apply SP1, but you must also enable trace flag 4199 in order to activate the fix. SQL Server 2014 Service Pack 1 made various fixes to the new cardinality estimator (new CE), and the release notes also document those fixes.
So, to suppress the regression, I suggest you enable it.
Trace flag 9481
solved your problem because it forced SQL Server to use the old optimizer. So basically you are using SQL Server 2014, but not the new CE that comes with it.
Trace flag 4199
makes sure the optimizer uses all changes/fixes made to the SQL Server optimizer since SQL Server 2005.
From this support article:

Trace flag 4199 was used to collect hotfixes that were intended to become on-by-default in a future release, whereas other trace flags were used for situations in which a fix was not intended to become on-by-default in its current form. Starting with SQL Server 2016 RTM, the database COMPATIBILITY_LEVEL setting will be used to enable trace flag 4199-related hotfixes on-by-default. This article describes the mechanics and policy of how plan-affecting hotfixes will be delivered for SQL Server 2016 and later versions.
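In other words, on SQL Server 2016 and later you can pick up the 4199-class fixes simply by raising the database compatibility level. A minimal sketch (the database name here is a placeholder):

```sql
-- Check the current compatibility level of the database.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'YourDatabase';  -- placeholder name

-- Raise it to 130 (SQL Server 2016) so 4199-related hotfixes apply by default.
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 130;
```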
There are three ways to enable it, depending on your needs:
By enabling the trace flag in a batch (using the DBCC TRACEON command) right before the target query, and then disabling the trace flag (using the DBCC TRACEOFF command) right after the query.
Starting with Microsoft SQL Server 2005 Service Pack 2 (SP2) and Microsoft SQL Server 2008, the query-level option QUERYTRACEON is available. This option lets you enable a plan-affecting trace flag during the compilation of a single query.
By enabling it in the startup parameters (the -T startup option).
The first and second options only make sure the query/batch runs with trace flag 4199, while the third option makes sure every query that runs sees this trace flag enabled.
NOTE: Enabling a trace flag requires sysadmin permission, so if you ask a developer to use it in a query, for instance, he might not be able to, assuming developers have limited access.
I would suggest you use QUERYTRACEON or DBCC TRACEON to see if the queries are performing up to the mark.
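The first two options can be sketched as follows; the table and predicate here are hypothetical stand-ins for your own query:

```sql
-- Option 1: enable the flag for the batch, run the query, then disable it.
DBCC TRACEON (4199);
SELECT * FROM dbo.SomeTable WHERE SomeColumn = 42;  -- hypothetical query
DBCC TRACEOFF (4199);

-- Option 2: enable the flag only for the compilation of a single query.
SELECT *
FROM dbo.SomeTable
WHERE SomeColumn = 42
OPTION (QUERYTRACEON 4199);
```

Both forms require sysadmin membership, which is why they are best suited to controlled testing rather than application code.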
The question of why one cardinality estimation model produces closer results than the other in this case is actually not that interesting. The original CE estimates that not finding a matching row has a very small probability; the new CE calculates that it is almost certain. Both are 'correct', just based on different modelling assumptions. Fundamentally, multi-column semi joins are tricky to evaluate based on single-column statistical information.
It is much more interesting to think about what the query is trying to do, and how we can write it in a way that is more compatible with the statistical information available to SQL Server.
A key observation is that the query will return row(s) with one value per group. In the case of the original query, that is row(s) with the minimum HistoryId value for each Transactionid. In the repro, it is row(s) with the minimum c1 value for each different value of c2. The NOT EXISTS query is just one way of expressing that requirement.
SQL Server has good statistical information about distinct values (density) so all we need to do is write the query in a way that makes it clear we want one value per group. There are many ways to do this, for example (using your repro):
SELECT *
FROM dbo.nat AS N
WHERE N.c1 =
(
SELECT MIN(N2.c1)
FROM dbo.nat AS N2
WHERE N2.c2 = N.c2
);
or, equivalently:
SELECT N.*
FROM dbo.nat AS N
JOIN
(
SELECT
N.c2,
MIN(N.c1) AS c1
FROM dbo.nat AS N
GROUP BY
N.c2
) AS J
ON J.c2 = N.c2
AND J.c1 = N.c1;
This produces an exactly correct estimate of 9999 rows in 2008 R2, 2012, and 2014 (both CE models):
With a natural index (which would probably be unique as well):
CREATE INDEX i ON dbo.nat (c2, c1);
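If the (c2, c1) combination really is unique, declaring that explicitly gives the optimizer even more to work with; a sketch, assuming the data contains no duplicate pairs:

```sql
-- Declare uniqueness explicitly if the data supports it.
CREATE UNIQUE INDEX i ON dbo.nat (c2, c1);
```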
The plan is even simpler:
You may not always be able to get this very simple plan form, depending on indexes, and other factors. The point I am making is that using basic grouping and joining operations often gets better results from the optimizer (and its cardinality estimation component) than more complex alternatives.
Final notes to clear some misconceptions in the question: the 'new CE' was introduced in 2014. TF 4199 enables plan-affecting optimizer fixes. TF 9481 specifies the original ('legacy') CE, and is only effective on 2014 and later versions.
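For completeness, the legacy CE can also be requested per query, or (from SQL Server 2016 onward) per database, without setting a server-wide trace flag. A sketch using the repro table:

```sql
-- Per query: compile this statement with the legacy (pre-2014) CE.
SELECT *
FROM dbo.nat AS N
WHERE N.c1 =
(
    SELECT MIN(N2.c1)
    FROM dbo.nat AS N2
    WHERE N2.c2 = N.c2
)
OPTION (QUERYTRACEON 9481);

-- Per database (SQL Server 2016+): no sysadmin-only trace flag needed.
ALTER DATABASE SCOPED CONFIGURATION
SET LEGACY_CARDINALITY_ESTIMATION = ON;
```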
Best Answer
I understand your disappointment with the query plan regressions that you experienced. However, Microsoft changed some core assumptions in the cardinality estimator, so they could not avoid some query plan regressions. To quote Juergen Thomas:
To answer your first question, the optimizer appears to pick a worse plan with the new CE because of a 1 row cardinality estimate from Object2. This makes a nested loop join very attractive to the optimizer. However, the actual number of rows returned from Object2 was 34182. This means that the estimated cost for the nested loop plan was an underestimate by about 30000X.

The legacy CE gives a 208.733 cardinality estimate from Object2. This is still very far off, but it's enough to give a plan that uses a merge join a lower estimated cost than a nested loop join plan. SQL Server gave the nonclustered index seek on Object3 a cost of 0.0032831. With a nested loop plan under the legacy CE, we could expect a total cost for 208 index seeks to be about 0.0032831 * 208.733 = 0.68529, which is much higher than the final estimated subtree cost for the merge join plan, 0.0171922.

To answer your second question, the cardinality estimate formulas for a query as simple as yours are actually published by Microsoft. I recommend referencing the excellent white paper on the differences between the legacy and new CE found here. Focus on why the cardinality estimates are 1 for the new CE and 208.733 for the legacy CE. That's unexpected, because the legacy CE assumes independence of filters but the new CE uses exponential backoff. In general, for such a query I would expect the new CE to give a larger cardinality estimate for Object2. You should be able to figure out what's going on by looking at the statistics on Object2.

To answer your third question, we can get general strategies from the white paper. The following is an abbreviated quote:
For your problem in particular, first I would focus on the statistics. It's not clear to me why the cardinality estimate for an index scan on Object3 would be so far off. I recommend updating statistics with FULLSCAN on all of the involved objects and indexes before doing more tests. Updating the statistics again after changing the CE is also a good step. You should be able to use the white paper to figure out exactly why you're seeing the cardinality estimates that you're seeing.

I can give you more detailed help if you provide more information. I understand wanting to protect your IP, but what you have there is a pretty simple query. Can you change the table and column names and provide the exact query text, relevant table DDL, index DDL, and information about the statistics?
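The statistics refresh suggested above can be sketched like this; the object names follow the anonymized names in the question, and the dbo schema is an assumption:

```sql
-- Rebuild all statistics on the involved tables by scanning every row.
UPDATE STATISTICS dbo.Object2 WITH FULLSCAN;
UPDATE STATISTICS dbo.Object3 WITH FULLSCAN;
```

Running this before comparing plans rules out stale or sampled statistics as the cause of the bad estimates.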
If all else fails and you need to fix it without hints or trace flags, you could try updating to SQL Server 2016 or changing the indexes on your tables. You're unlikely to get the bad nested loop plan if you remove the index, but of course removing an index could affect other queries in a negative way.