SQL Server Performance – Row Estimates Change with Join Hint

optimizationperformanceperformance-tuningsql servert-sql

I have a query that joins a few tables and performs pretty badly – row estimates are way (a 1000 times) off and Nested Loops join is chosen, resulting in multiple table scans. The shape of the query is fairly straightforward, looking something like this:

SELECT t1.id
FROM t1
INNER JOIN t2 ON t1.id = t2.t1_id
LEFT OUTER JOIN t3 ON t2.id = t3.t2_id
LEFT OUTER JOIN t4 ON t3.t4_id = t4.id 
WHERE t4.id = some_GUID

Playing around with the query, I noticed that when I hint it to use a Merge join for one of the joins, it runs many times faster. This I can understand – Merge join is a better option for the data that is joined, but SQL Server just doesn't estimate it right choosing the Nested Loops.

What I don't fully understand is why does this join hint changes all the estimates for all the plan operators? From reading different articles and books, I assumed that the cardinality estimations are performed before the plan is built, so using a hint would not have changed the estimations, but rather explicitly tell SQL Server to use a particular physical join implementation.

What I see, however, is that Merge hint causes all estimations to become pretty much perfect. Why does this happen and are there any common techniques to make query optimizer make a better estimate without a hint – considering that statistics obviously allow for this?

UPD:
anonymized execution plans can be found here:
https://www.dropbox.com/s/hchfuru35qqj89s/merge_join.sqlplan?dl=0 https://www.dropbox.com/s/38sjtv0t7vjjfdp/no_hints_join.sqlplan?dl=0

I checked the stats used by both queries using TF 3604, 9292 and 9204, and those are identical. However indexes which are scanned/seeked differ between the queries.

Besides that, I tried running the query with OPTION (FORCE ORDER) – it runs even faster than using merge join, choosing HASH MATCH for every join.

Best Answer

From reading different articles and books, I assumed that the cardinality estimations are performed before the plan is built.

Not exactly. An initial cardinality estimate is derived (after simplifications and other work), which influences the initial join order chosen by the optimizer.

However, subsequent explorations (during cost-based optimization) can, and often do, result in new cardinality estimations being calculated. These later CEs may be more or less 'accurate'. If an under-estimate results, the optimizer may choose a plan that looks cheaper, but in fact runs for much longer.

In general, there is no guarantee that cardinality estimations for semantically identical subtrees will produce the same results. It is a statistical process, after all, and some operations have deeper CE support than others.

In your case, there appears to be another factor - the optimizer introduces (or moves around) a Top, which sets a row goal on the subtree beneath it:

Plan fragment

If you were to enable trace flag 4138 (on 2008 R2 or later), you may find the estimates more in-line with expectations, or perhaps even that the optimizer would no longer choose nested loops.

What I see, however, is that Merge hint causes all estimations to become pretty much perfect.

There is an element of luck involved here. People tend to write queries, or at least the joins, in the order they expect them to be performed physically. Using a join hint comes with an implied FORCE ORDER, thereby fixing the join order to match the textual form, and switching off many optimizer exploration rules that can lead to cardinality re-estimation.

Besides that, I tried running the query with OPTION (FORCE ORDER) - it runs even faster than using merge join, choosing HASH MATCH for every join.

This is the same as hinting a join, but does not restrict the choice of physical join operator. Again, if you happened to write the query join order in a logical way, it is quite likely you will get a reasonable plan. Of course, you miss out on much of the optimizer's abilities this way, which may not produce optimal results in more general situations.

You will probably not want to use FORCE ORDER very often because it is an extremely powerful hint (directive) that has wider effects than simple forcing the order of joins; for example, it prevents the optimizer moving aggregates around and introducing partial aggregations. I very much advise against using this hint except in very exceptional circumstances, and by truly expert tuners.

A detailed analysis would require more time than I have right now, and access to a statistics-only copy of the database.