SQL Server – Rules for Using CTE as an Optimization Fence

cteoptimizationperformancequery-performancesql server

A while back, Brent Ozar published a post detailing some of the differences between the SQL Server and PostgreSQL:

Two Important Differences Between SQL Server and PostgreSQL

The first point (“CTEs are optimization fences”) caught my eye, because it is obvious that in the example provided, SQL Server combines the CTE and the main query together and optimizes it as a single query (as opposed to the opposite behavior in PostgreSQL).

However, this behavior seems contrary to the examples that I have seen in other blogs and training classes, where SQL Server does treat the CTE as an optimization fence, which allows for better use of indexes, better performance, etc. For example:

A Better Way To Select Star

So, it seems like SQL Server “honors” the CTE as an optimization fence SOMETIMES. Are there any good resources available that document the specific list of known cases where SQL Server will reliably honor the CTE as an optimization fence (or the opposite behavior)?

Best Answer

...list of known cases where SQL Server will reliably honor the CTE as an optimization fence

Any such list would rely on observed behaviour, with no guarantee of reliability.

The SQL Server query optimizer never treats a common table expression as an optimization fence per se, though some constructions are clearly difficult to optimize across. Recursive CTEs are a good example of this.

CTEs are treated very similarly to views/inline functions/subqueries/derived tables and inlined into the query. Any observed 'fence' behaviour depends on the optimizer either not being able to, or deciding not to, optimize across that in-principle permeable border.

Generally speaking, the simpler and more 'relational' the CTE is, the more likely it is that the optimizer will be able to move bits around.

Features that would allow the optimizer to consider, or force it to materialize the 'result' of a CTE have been suggested, but not yet implemented:

In the meantime, the most common workaround is to explicitly materialize the intermediate result set in a temporary table or table variable. This obviously requires a scenario not limited to a single statement.