I came across this issue recently and couldn't find any discussion of it online.
The query below
DECLARE @S VARCHAR(1) = '';
WITH T
AS (SELECT name + @S AS name2,
*
FROM master..spt_values)
SELECT *
FROM T T1
INNER JOIN T T2
ON T1.name2 = T2.name2;
Always gets a nested loops plan
Attempting to force the issue with INNER HASH JOIN
or INNER MERGE JOIN
hints produces the following error.
Query processor could not produce a query plan because of the hints
defined in this query. Resubmit the query without specifying any hints
and without using SET FORCEPLAN.
I found a workaround which does allow hash or merge joins to be used – wrapping the variable in an aggregate. The plan generated is significantly lower costed (19.2025 vs 0.261987)
DECLARE @S2 VARCHAR(1) = '';
WITH T
AS (SELECT name + (SELECT MAX(@S2)) AS name2,
*
FROM spt_values)
SELECT *
FROM T T1
INNER JOIN T T2
ON T1.name2 = T2.name2;
What is the reason for this behaviour? and is there a better workaround than the one that I found? (that perhaps doesn't require the extra execution plan branches)
Best Answer
I've tried your query on a SQL 2012 instance and trace flag 4199 seems to fix the issue. With it enabled I get a merge join for a total cost of 0.24 and none of the extra branches.
The specific KB article for this issue is Performance issues occur when the join predicate in your query has outer reference columns in SQL Server 2005 or in SQL Server 2008
To further qualify, TF 4199 enables all optimizer fixes. See this link for more information. Enabling everything at once can have weird side-effects, so if you can find a specific fix it might be better to enable the fix on it's own.
You can enable a trace flag on a per query basis using
OPTION (QUERYTRACEON 4199)
;