Sql-server – Why can’t unused columns and tables be ignored in a recursive CTE

cterecursivesql server

Usually, SQL Server can optimize away any unused columns in the execution plan, and unused joined tables are not queried. But as soon as a recursive CTE comes into play, apparently all columns and joined tables are queried, even if not necessary. Why is that? Can I help SQL Server to ignore unused columns/tables here?

Example:

;WITH cte AS
(
    SELECT cn.CatalogNodeId,
           cn.ParentId,
           cn.CatalogNodeType,
           (SELECT Name FROM dbo.LocalizedCatalogNodes WHERE CatalogNodeId = cn.CatalogNodeId) AS Name
    FROM dbo.CatalogNodes AS cn
    WHERE cn.ParentId IS NULL
    UNION ALL
    SELECT cn.CatalogNodeId,
           cn.ParentId,
           cn.CatalogNodeType,
           (SELECT Name FROM dbo.LocalizedCatalogNodes WHERE CatalogNodeId = cn.CatalogNodeId) AS Name
    FROM dbo.CatalogNodes AS cn
    JOIN cte ON cte.CatalogNodeId = cn.ParentId
)
SELECT CatalogNodeId FROM cte

This is a simple recursive CTE resolving a hierarchical parent-child structure, adding a localized name to each node.

Even though not necessary, the execution plan shows that column CatalogNodeType is retrieved and dbo.LocalizedCatalogNodes is joined for each iteration.

Now comment out the UNION ALL part making it a non-recursive CTE, and the execution plan suddenly consists of just two steps, not containing column CatalogNodeType or table dbo.LocalizedCatalogNodes.

Best Answer

It's just a matter of order of operations, which is dependent on what the query optimizer thinks is going to be the most efficient execution plan. Yes, we can obviously see that in the end, the final SELECT list only consists of the CatalogNodeId column, and therefore the other columns of the CTE don't matter, so Microsoft probably could've implemented recursive CTEs to be a little bit better with their available execution plans in this regard. But even easier it is for the developer to recognize the lack of necessity for those columns in the recursive CTE, and therefore can be removed from the query, which is generally best practice anyway.