Why does the inner join to a one record temp table make the query take so much longer time?
Without the join, the optimizer is smart enough to work out that it can find the minimum value by reading one row from the end of the index.
Unfortunately, it is not currently equipped to apply the same sort of logic when the query is more complicated (with a join or grouping clause, for example). To work around this limitation, you can rewrite the query to compute local minimums per row in the temporary table, then find the global minimum.
Perhaps the easiest way to express this in T-SQL is to use the APPLY
operator:
SELECT
-- Global minimum
@tenor_from = MIN(MinMaturityPerCurveID.maturity_date)
FROM #source_price_curve_list AS SPCL
CROSS APPLY
(
-- Minimum maturity_date per price_curve_id
SELECT TOP (1)
SPC.maturity_date
FROM dbo.source_price_curve AS SPC
WHERE
SPC.source_curve_def_id = SPCL.price_curve_id
and as_of_date >= @as_of_date_from
ORDER BY
SPC.maturity_date ASC
) AS MinMaturityPerCurveID;
Good performance relies on there being many rows per price_curve_id
. You may need an index of the form:
CREATE NONCLUSTERED INDEX
[IX dbo.source_price_curve source_curve_def_id, maturity_date, as_of_date]
ON dbo.source_price_curve
(
source_curve_def_id,
maturity_date,
as_of_date
);
(summarizing my comments and putting as answer)
A query rewrite will solve the issue of getting low row estimates. As Joe Chang explains in his blog post Query Optimizer Gone Wild - Full-Text
CONTAINS is "a predicte used in a WHERE clause" per Microsoft documentation, while CONTAINSTABLE acts as a table.
You get a much better plan (merge join) using CONTAINSTABLE
vs the actual plan using contains
uses a nested loop join with low row estimates.
You can rewrite the query as :
SELECT TOP 30 p.PersonId,
p.PersonParentId,
p.PersonName,
p.PersonPostCode
FROM dbo.People p
left join containstable (ContactFullText, '"mr" AND "ch*"') cf on cf.[yourKey] = p.PersonId
WHERE p.PersonDeletionDate IS NULL
AND p.PersonCustomerId = 24
--AND CONTAINS(ContactFullText, '"mr" AND "ch*"')
AND p.PersonGroupId IN(197, 206, 186, 198)
AND [RANK] > 0
ORDER BY p.PersonParentId,
p.PersonName;
Best Answer
To answer your first question:
SQL Server is effectively rewriting your query. This query:
Can also be written like this:
That's why you have two different access methods on
Object2
in the query plan. TheMerge Join (Concatenation)
operation isn't actually a merge join. It's just implementing aUNION ALL
and combining the results. The stream aggregate that you mentioned groups byObject3.Column2
. This both removes duplicates from theMerge Join (Concatenation)
and sorts the data so that it can be used in the followingMERGE JOIN
to Object4.To answer your second question:
It looks like a bug. After some searching I found an article by Paul White explaining the issue. The bug is reported to Connect if you want to vote for it or add yourself as affected.
In short, the cardinality estimate is 24595900 because 27328800 (table cardinality of
Object2
) * 0.9 = 24595900 after rounding. You're on SQL Server 2008 so the legacy CE calculation is used:I recommend reading through the entire article. The issue is a little difficult to summarize, but I will attempt to do so by quoting part of the text near the end:
Splitting up the query as you did should prevent the issue. You also might have some luck by rewriting the query using
UNION
(see Example 4 in the article).