SQL Server – Understanding Constant Scan Spooling

execution-plansql server

I have a table with a few dozens of rows. Simplified setup is following

CREATE TABLE #data ([Id] int, [Status] int);

INSERT INTO #data
VALUES (100, 1), (101, 2), (102, 3), (103, 2);

And I have a query that joins this table to a set of table value constructed rows (made of variables and constants), like

DECLARE @id1 int = 101, @id2 int = 105;

SELECT
    COALESCE(p.[Code], 'X') AS [Code],
    COALESCE(d.[Status], 0) AS [Status]
FROM (VALUES
        (@id1, 'A'),
        (@id2, 'B')
    ) p([Id], [Code])
    FULL JOIN #data d ON d.[Id] = p.[Id];

Query execution plan is showing that optimizer's decision is to use FULL LOOP JOIN strategy, which seems appropriate, since both inputs have very few rows. One thing I noticed (and cannot agree), though, is that TVC rows are being spooled (see area of the execution plan in the red box).

Constant Scan spooling

Why optimizer introduces spool here, what is the reason to do it? There is nothing complex beyond the spool. Looks like it is not necessary. How to get rid of it in this case, what are the possible ways?


The above plan was obtained on

Microsoft SQL Server 2014 (SP2-CU11) (KB4077063) – 12.0.5579.0 (X64)

Best Answer

Why optimizer introduces spool here, what is the reason to do it? There is nothing complex beyond the spool.

The thing beyond the spool is not a simple table reference, which could simply be duplicated when the left join / anti semi join alternative is generated.

It may look a little like a table (Constant Scan) but to the optimizer* it is a UNION ALL of the separate rows in the VALUES clause.

The additional complexity is enough for the optimizer to choose to spool and replay the source rows, and not replace the spool with a simple "table get" later on. For example, the initial transformation from full join looks like this:

early plan

Notice the extra spools introduced by the general transform. Spools above a simple table get are cleaned up later by the rule SpoolGetToGet.

If the optimizer had a corresponding SpoolConstGetToConstGet rule, it could work as you wish, in principle.

How to get rid of it in this case, what are the possible ways?

Use a real table (temporary or variable), or write the transformation from full join manually, for example:

WITH 
    p([Id], [Code]) AS
    (
        SELECT @id1, 'A'
        UNION ALL
        SELECT @id2, 'B'
    ),
    FullJoin AS
    (
        SELECT
            p.Code,
            d.[Status]
        FROM p
        LEFT JOIN #data d 
            ON d.[Id] = p.[Id]
        UNION ALL
        SELECT
            NULL,
            D.[Status]
        FROM #data AS D
        WHERE NOT EXISTS
        (
            SELECT *
            FROM p
            WHERE p.Id = D.Id
        )
    )
SELECT
    COALESCE(FullJoin.Code, 'X') AS Code,
    COALESCE(FullJoin.Status, 0) AS [Status]
FROM FullJoin;

Plan for manual rewrite:

Manual rewrite plan

This has an estimated cost of 0.0067201 units, compared with 0.0203412 units for the original.


* It can be observed as a LogOp_UnionAll in the Converted Tree (TF 8605). In the Input Tree (TF 8606) it is a LogOp_ConstTableGet. The Converted Tree shows the tree of optimizer expression elements after parsing, normalization, algebrization, binding, and some other preparatory work. The Input Tree shows the elements after conversion to Negation Normal Form (NNF convert), runtime constant collapsing, and a few other bits and bobs. NNF convert includes logic to collapse logical unions and common table gets, among other things.