Sql-server – Why Does the SQL Server Query Optimizer Not Convert an OUTER JOIN to an INNER JOIN When a Trusted Foreign Key Is Present

optimizationperformancequery-performancesql server

It appears that the SQL Server query optimizer does not translate an OUTER JOIN into an INNER JOIN when the join column in the first table is defined as NOT NULL and has a trusted foreign key constraint to the corresponding column in the second table.

It seems that in this scenario, the OUTER JOIN could be translated into an equivalent INNER JOIN, because each row in the first table is:

  1. guaranteed to have a value in the column (NOT NULL constraint), and
  2. guaranteed to have a matching row in the second table (trusted foreign key constraint).

For example, consider the following tables:

 CREATE TABLE dbo.tbl_fk
 (
    fk_val CHAR(1) NOT NULL PRIMARY KEY CLUSTERED,
    junk VARCHAR(100) NOT NULL
 );
 
 CREATE TABLE dbo.tbl_main
 (
    id INT NOT NULL PRIMARY KEY CLUSTERED IDENTITY(1,1),
    fk_val CHAR(1) NOT NULL FOREIGN KEY REFERENCES dbo.tbl_fk(fk_val)
 );

In the following query, why does the optimizer not convert the LEFT OUTER JOIN to an INNER JOIN in the execution plan?

SELECT m.fk_val, f.junk
FROM dbo.tbl_main AS m
LEFT OUTER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val;

By adding a predicate to explicitly remove NULL values from the second table, however, the optimizer converts the LEFT OUTER JOIN into an INNER JOIN in the query plan, as expected:

SELECT m.fk_val, f.junk
FROM dbo.tbl_main AS m
LEFT OUTER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val
WHERE f.fk_val IS NOT NULL;

Undocumented trace flags show that a rule is applied to change the OUTER JOIN to an INNER JOIN in the query with the predicate:

SELECT m.fk_val, f.junk
FROM dbo.tbl_main AS m
LEFT OUTER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val
WHERE f.fk_val IS NOT NULL
OPTION (QUERYTRACEON 3604, QUERYTRACEON 8621);

***** Rule applied: A LOJ B -> A JN B

These examples were tested using SQL Server 2014 and 2019 using a number of different compatibility levels and also using both the legacy and "new" cardinality estimator. The behavior appeared to be the same in all cases.

Forrest McDaniel pointed out that (as of 2010), Sybase appeared to have such a transformation built into their product (see Example 1): http://dcx.sybase.com/1200/en/dbusage/queryopt-sectb-5356466.html

Best Answer

I'm not sure there's a "good reason" why the query transformation doesn't happen, other than it hasn't been implemented.

You can see a potential benefit in queries like

SELECT m.fk_val, count(f.junk)
FROM dbo.tbl_main AS m
LEFT OUTER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val
group by m.fk_val

Which can have the aggregate pushed before the join if it's an INNER JOIN.

SELECT m.fk_val, count(f.junk)
FROM dbo.tbl_main AS m
INNER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val
group by m.fk_val

But in the case of the conversion of OUTER JOIN + Predicate on outer column, the outer column predicate is either eliminated entirely, or can be pushed to before the join.

EG, with

SELECT m.fk_val, f.junk
FROM dbo.tbl_main AS m
LEFT OUTER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val
WHERE f.fk_val IS NOT NULL

This query is logically equivalent to the simpler query

SELECT m.fk_val, f.junk
FROM dbo.tbl_main AS m
INNER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val

which has one less predicate.

And if you try to push the outer table predicate before the join without changing it to an INNER JOIN you would still have to apply the predicate after the join.

For something like

SELECT m.fk_val, m.Id
FROM dbo.tbl_main AS m
LEFT OUTER JOIN dbo.tbl_fk AS f
    ON m.fk_val = f.fk_val
WHERE f.junk = 'a';

If you apply the f.junk = 'a' predicate before the join, you'll filter out some of the rows before the JOIN. But then the inner rows corresponding to outer rows with f.junk <> 'a' would survive the OUTER JOIN, and you'd have to apply f.junk = 'a' again after the join.