Sql-server – Why does referencing a variable in a join predicate force nested loops

optimizationperformancequery-performancesql server

I came across this issue recently and couldn't find any discussion of it online.

The query below

DECLARE @S VARCHAR(1) = '';

WITH T
     AS (SELECT name + @S AS name2,
                *
         FROM   master..spt_values)
SELECT *
FROM   T T1
       INNER JOIN T T2
         ON T1.name2 = T2.name2;

Always gets a nested loops plan

enter image description here

Attempting to force the issue with INNER HASH JOIN or INNER MERGE JOIN hints produces the following error.

Query processor could not produce a query plan because of the hints
defined in this query. Resubmit the query without specifying any hints
and without using SET FORCEPLAN.

I found a workaround which does allow hash or merge joins to be used – wrapping the variable in an aggregate. The plan generated is significantly lower costed (19.2025 vs 0.261987)

DECLARE @S2 VARCHAR(1) = '';

WITH T
     AS (SELECT name + (SELECT MAX(@S2)) AS name2,
                *
         FROM   spt_values)
SELECT *
FROM   T T1
       INNER JOIN T T2
         ON T1.name2 = T2.name2; 

enter image description here

What is the reason for this behaviour? and is there a better workaround than the one that I found? (that perhaps doesn't require the extra execution plan branches)

Best Answer

I've tried your query on a SQL 2012 instance and trace flag 4199 seems to fix the issue. With it enabled I get a merge join for a total cost of 0.24 and none of the extra branches.

The specific KB article for this issue is Performance issues occur when the join predicate in your query has outer reference columns in SQL Server 2005 or in SQL Server 2008

enter image description here

To further qualify, TF 4199 enables all optimizer fixes. See this link for more information. Enabling everything at once can have weird side-effects, so if you can find a specific fix it might be better to enable the fix on it's own.

You can enable a trace flag on a per query basis using OPTION (QUERYTRACEON 4199);