SQL Server 2017 – How Interleaved Execution Works

functionsinterleaved-executionsql serversql-server-2017

Interleaved Execution is part of a family of features in the 2017 query processor that consists of:

So how does Interleaved Execution work?

Best Answer

The Devil Is In The Table Variable

Interleaved execution is aimed at correct cardinality misestimated in Multi-Statement Table Valued Functions.

In prior versions of SQL Server, these functions would always produced rather shoddy estimates:

2014, 2016: 100 rows
2005 - 2012: 1 row

Needless to say, this could cause a lot of problems when joining to other tables.

While selecting data from a table variable does not inhibit parallelism on its own, the low row estimates would often contribute to low query costs, where parallelism wouldn't be considered.

With Interleaved Execution, cardinality estimation is paused, the subtree for the MSTVF is executed, and optimzation is resumed with a more accurate cardinality estimate.

How do I know if my MSTVF receives interleaved execution.

Like with Adaptive Joins, Interleaved Execution is noted in the query plan. Unlike Adaptive Joins, it is not noted in estimated plans, at least as of this writing.

The plan shape for a MSTVF with Interleaved Execution is a bit different from a typical plan that has a MSTVF in it.

You'll see the Table Valued Function operator at the top of the plan, and the scan of the Table Variable where the TVF operator would normally be in the graphical plan.

When hovering over the TVF operator, you'll see the attribute IsInterleavedExecuted set to True, as well as a an estimated number of rows that may very nearly reflect reality. Hurrah.

Are there any Extended Events to troubleshoot when Interleaved Execution doesn't occur?

Yes, a whole bunch:

Note that some of these are in the Debug channel, which isn't selected by default when searching for Events to Extend.

Does Interleaved Exection Require ColumnStore Indexes?

No, they'll work either way. Here's an example:

SELECT u.Id, mj.*
FROM   dbo.Users_cx AS u --ColumnStore
JOIN   dbo.MultiStatementTVF_Join(0) AS mj
ON mj.UserId = u.Id
WHERE  u.LastAccessDate >= '2016-12-01';


SELECT u.Id, mj.*
FROM   dbo.Users AS u --RowStore
JOIN   dbo.MultiStatementTVF_Join(0) AS mj
ON mj.UserId = u.Id
WHERE  u.LastAccessDate >= '2016-12-01';

These queries each join to different tables. One ColumnStore, one not. They both get Interleaved Execution plans.

When does Interleaved Execution work?

Right now, it only works with MSTVFs where the correlation is done outside of the function.

Here are a couple examples:

This function has no inner correlation, meaning there's no WHERE clause predicated on a table column and a passed in variable.

CREATE OR ALTER FUNCTION dbo.MultiStatementTVF_Join
(
    @h BIGINT
)
RETURNS @Out TABLE
(
    UserId INT,
    BadgeCount BIGINT
)
AS
    BEGIN
        INSERT INTO @Out ( UserId, BadgeCount )
        SELECT   b.UserId, COUNT_BIG(*) AS BadgeCount
        FROM     dbo.Badges AS b
        GROUP BY b.UserId
        HAVING   COUNT_BIG(*) > @h;
        RETURN;
    END;
GO

This function is the opposite, with a predicate on the UserId column with a passed in variable.

CREATE OR ALTER FUNCTION dbo.MultiStatementTVF_CrossApply
(
    @h BIGINT,
    @id INT
)
RETURNS @Out TABLE
(
    UserId INT,
    BadgeCount BIGINT
)
AS
    BEGIN
        INSERT INTO @Out ( UserId, BadgeCount )
        SELECT   b.UserId, COUNT_BIG(*) AS BadgeCount
        FROM     dbo.Badges AS b
        WHERE    b.UserId = @id
        GROUP BY b.UserId
        HAVING   COUNT_BIG(*) > @h;
        RETURN;
    END;
GO

It's a common misconception that CROSS APPLY won't work. The real limitation is noted previously. The inner-function correlation is the deal breaker.

SELECT u.Id, mj.*
FROM   dbo.Users AS u --RowStore
CROSS APPLY dbo.MultiStatementTVF_Join(0) AS mj
WHERE  mj.UserId = u.Id
       AND u.LastAccessDate >= '2016-12-01';


SELECT   TOP 1 u.Id, mj.*
FROM     dbo.Users AS u --RowStore
CROSS APPLY dbo.MultiStatementTVF_CrossApply(2147483647, u.Id) AS mj
WHERE    u.LastAccessDate >= '2016-12-01'
ORDER BY u.Id;

Related Solutions

SQL Server 2017 – How Batch Mode Adaptive Joins Work

Batch Mode Adaptive Joins

For Batch Mode Adaptive Joins, the goal is to not pin join choice to a specific type at compile time.

When available, Adaptive Joins allow the optimizer to choose between Nested Loops Joins and Hash Joins based on row thresholds at run time.

At this time, Merge Joins are not considered. Pure speculation is that needing data to be sorted, or needing to inject a sort into the plan would add too much overhead when changing the course of a query.

When Do Batch Mode Adaptive Joins Occur?

At this time, Batch Mode query processing requires the presence of a ColumnStore index. They also require, well, a join, and an index that allows for the choice of a Nested Loops or Hash Join.

How do I know if my Join is Adaptive?

Query plans for adaptive joins are quite distinctive.

The Adaptive Join operator is new to SQL Server 2017, and currently has the following properties in actual execution plans.

Physical Operation: Adaptive Join
Actual Join Type: Will be Hash match or Nested Loops
Adaptive Threshold Rows: Signifies the tipping point when the join type will switch to Hash Match
Is Adaptive: True for Adaptive Joins
Estimated Join Type: Rather self-explanatory!

In an estimated or cached plan, there is rather less information:

Most notably, the Actual Join Type is missing.

What breaks Batch Mode Adaptive Joins?

To monitor this, there's an Extended Event session called adaptive_join_skipped, and it has the following reasons for skipping a Batch Mode Adaptive Join:

eajsrExchangeTypeNotSupported
eajsrHJorNLJNotFound
eajsrInvalidAdaptiveThreshold
eajsrMultiConsumerSpool
eajsrOuterCardMaxOne
eajsrOuterSideParallelMarked
eajsrUnMatchedOuter

Aside from those, Batch Mode Adaptive Joins may be skipped for other reasons. Take these two queries for example:

/*Selecting just integer data*/
SELECT uc.Id, uc.Reputation, p.Score
FROM   dbo.Users_cx AS uc
JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate >= '20160101';

/*Selecting one string column from Users*/
SELECT uc.Id, uc.DisplayName, uc.Reputation, p.Score
FROM   dbo.Users_cx AS uc
JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate >= '20160101';

They're identical except that the second query selects the DisplayName column, which has a type of NVARCHAR(40).

The Batch Mode Adaptive Join is skipped for the second query, but no reason is logged to the XE session. It appears that string data remains the steadfast enemy of ColumnStore indexes.

There are other query patterns that fail to get Adaptive Joins, that also do not trigger events.

Some examples:

CROSS APPLY with a TOP
OUTER APPLY

eajsrExchangeTypeNotSupported

One thing that will trigger this event appears to be the presence of a Repartition Streams operator. In this query, the partitioning type is Hash Match. Special thanks to the Intergalactic Celestial Being masquerading as a humble blogging man known as Paul White for the bizarre query.

SELECT uc.Id, uc.Reputation, CONVERT(XML, CONVERT(NVARCHAR(10), p.Score)).value('(xml/text())[1]', 'INT') AS [Surprise!]
FROM   dbo.Users_cx AS uc
JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate <= '2009-08-01'
OPTION ( USE HINT ( 'ENABLE_PARALLEL_PLAN_PREFERENCE' ));
GO

eajsrHJorNLJNotFound

No queries have triggered this XE yet. What doesn't work:

Merge Join hint
Query patterns that rule out join type, for example Hash and Merge joins require at least one equality predicate. Writing a join on >= and <= does not trigger the event.

eajsrInvalidAdaptiveThreshold

This event can be triggered by various TOP, FAST N, and OFFSET/FETCH queries. Here are some examples:

SELECT uc.Id, uc.DisplayName, uc.Reputation, p.Score
FROM   dbo.Users_cx AS uc
INNER JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate <= '2008-01-01'
OPTION ( FAST 1);
GO 

SELECT TOP 1 uc.Id, uc.DisplayName, uc.Reputation, p.Score
FROM   dbo.Users_cx AS uc
INNER JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate <= '2008-01-01';
GO

In some circumstances, they can also be triggered by paging-style queries:

WITH pages
AS ( SELECT TOP 100 uc.Id, ROW_NUMBER() OVER ( ORDER BY uc.Id ) AS n
     FROM   dbo.Users_cx AS uc ),
     rows
AS ( SELECT TOP 50 p.Id
     FROM   pages AS p
     WHERE  p.n > 50 )
SELECT u.Id, u.Reputation
FROM   pages AS p
JOIN   dbo.Users AS u
ON p.Id = u.Id;

eajsrMultiConsumerSpool

No known query patterns have triggered this event yet.

What hasn't triggered it so far:

Recursive CTEs
Grouping sets/Cube/Rollup
PIVOT and UNPIVOT
Windowing functions

eajsrOuterCardMaxOne

A couple different types of queries have triggered this event. A derived join with a TOP 1, and a join combined with a WHERE clause with an equality predicate on a unique column:

SELECT uc.Id, uc.Reputation, p.Score
FROM   dbo.Users_cx AS uc
JOIN   (SELECT TOP 1 p2.OwnerUserId, p2.Score FROM dbo.Posts AS p2 ORDER BY Id) AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate <= '2009-08-01';
GO 

SELECT p.Id, p.ParentId, p.OwnerUserId
FROM dbo.Posts AS p
JOIN dbo.Users_cx AS uc
ON uc.Id = p.OwnerUserId
WHERE p.Id = 17333;
GO

eajsrOuterSideParallelMarked

One type of query that can trigger this event is a Recursive CTE.

WITH postparent AS 
    (
    SELECT p.Id, p.ParentId, p.OwnerUserId 
    FROM dbo.Posts_cx AS p
    WHERE p.Id = 17333

        UNION ALL

    SELECT p2.Id, p2.ParentId, p2.OwnerUserId
    FROM postparent pp
    JOIN dbo.Posts_cx AS p2
    ON pp.Id = p2.ParentId
    )
    SELECT pp.Id, pp.ParentId, pp.OwnerUserId, u.DisplayName
    FROM postparent pp
    JOIN dbo.Users AS u
    ON u.Id = pp.OwnerUserId
    ORDER BY pp.Id
    OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));

The reason here appears to be that the recursive portion of the CTE, which causes a serial zone in the plan, disallows Batch Mode Adaptive Join choice.

eajsrUnMatchedOuter

This is the most common, and seems to occur when an index is used for the join that cannot support a seek. For instance, this query causes a Key Lookup:

SELECT uc.Id, uc.Reputation, p.Score, p.LastActivityDate 
FROM   dbo.Users_cx AS uc
JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate >= '20080101'
       AND uc.DisplayName = 'harrymc'
       AND p.Score > 1;

The resulting query chooses a Row Mode Nested Loops join to execute both the Key Lookup and the table joins, which triggers the event.

Another example is a query that skips the narrow nonclustered index in favor of the PK/CX. In this case, the PK/CX does not lead with OwnerUserId, so the only join choice is a Hash Join.

In both cases, the "unmatched outer" seems to indicate that the index chosen does not sufficiently cover our query.

SELECT uc.Id, uc.Reputation, p.*
FROM   dbo.Users_cx AS uc
JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
WHERE  uc.LastAccessDate >= '20160101';

Do Batch Mode Adaptive Joins work with multiple joins?

Yes, but as of this writing, there appears to be a limitation:

Joining from one ColumnStore index to multiple Row Store indexes will yield multiple Adaptive Joins, whereas a join between multiple ColumnStore indexes will not be Adaptive.

For example, these two queries

SELECT uc.Id, uc.Reputation, p.Score
FROM   dbo.Users_cx AS uc
JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
JOIN dbo.Comments AS c
ON c.PostId = p.Id
AND  c.UserId = uc.Id
WHERE  uc.LastAccessDate >= '20160101';

SELECT uc.Id, uc.Reputation, p.Score
FROM   dbo.Users_cx AS uc
JOIN   dbo.Posts AS p
ON p.OwnerUserId = uc.Id
JOIN dbo.Comments_cx AS c
ON c.PostId = p.Id
AND  c.UserId = uc.Id
WHERE  uc.LastAccessDate >= '20160101';

The first query joins one ColumnStore index (on Users) to two Row Store indexes on Posts and Comments. This yields two Adaptive Join operators.

The second query joins two ColumnStore tables (Users and Comments) to one Row Store table (Posts), and yields one Adaptive Join.

Is there any overhead to Batch Mode Adaptive Joins?

Yes, all Batch Mode Adaptive Join plans receive a memory grant. This isn't always the case with Nested Loops join, unless they receive the Nested Loops prefetch optimization. The memory grant is to support a Hash Join should the plan require one based on row thresholds.

SQL Server 2017 – How Batch Mode Memory Grant Feedback Works

What's Memory Grant Feedback All About?

Batch Mode Memory Grant Feedback attempts to right-size memory grants for queries, correcting for both over- and under-estimates.

When a query runs that requires a memory grant, the optimizer will ask for a grant of a size that it thinks will keep all row operations in memory.

Things that commonly require memory grants:

Sorts
Hash Joins

When too much memory is asked for and granted, concurrency may suffer as a result. Memory is a finite resource, and not everything can use all of it all the time. There's no such thing as unlimited access to such resources.

When too little memory is asked for, queries may spill to disk. Look, no one wants their queries spilling anywhere. Disk is dreadful.

How Does The Magic Work?

When a plan that requires a memory grant executes and is cached, actual memory needed to run the plan is recalculated, and the plan information is updated accordingly. Right now, it requires the presence of a ColumnStore index to achieve Batch execution mode.

What If The Magic Isn't So Magical?

This feature does have a terminating point in which it will fall back to the original memory grant. If queries run that need constant recalculating, our magical feature will give up. Eventually. As of this writing, I don't have all of the implementation details on when it will quit.

How Do I Know If It's Working?

You can use Extended Events:

You may also observe it during regular query tuning over multiple runs in the actual execution plan.

Can You Show Me An Example?

Of course! Here's a stored procedure. Under the right circumstances, it will ask for an incorrect memory grant.

CREATE OR ALTER PROCEDURE dbo.LargeUnusedGrant (@OwnerUserId INT)
AS 
BEGIN

    SELECT TOP 200 *
    FROM dbo.Posts_cx AS p
    WHERE p.OwnerUserId = @OwnerUserId
    AND p.PostTypeId = 1
    ORDER BY p.Score DESC, p.Id DESC;

END;
GO

The query plan has a warning on the select operator.

Sadness indeed! Our query asked for too much memory.

On second execution, the warning will disappear.

I've Heard About Some Sneaky Tricks...

Itzik Ben-Gan and Niko Neugebauer have both come up with workarounds to features and operators that require Batch Mode execution.

Create a filtered, non-clustered ColumnStore index with a WHERE clause that contains 0 rows
Join your table to a temp table with a Clustered ColumnStore index on it with no rows using a superfluous Left Join

Both methods are valid workaround to get this to work!

Creating an empty nonclustered ColumnStore index:

/*Itzik's Trizik*/
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_helper
    ON dbo.Posts
(
    Id,
    AcceptedAnswerId,
    AnswerCount,
    ClosedDate,
    CommentCount,
    CommunityOwnedDate,
    CreationDate,
    FavoriteCount,
    LastActivityDate,
    LastEditDate,
    LastEditorDisplayName,
    LastEditorUserId,
    OwnerUserId,
    ParentId,
    PostTypeId,
    Score,
    ViewCount,
    IsHot )
    WHERE ( Id = -2147483647 AND Id = 2147483647) -- eez impossible!

Running a different version of the proc against the table with the empty index:

CREATE OR ALTER PROCEDURE dbo.LargeUnusedGrant_alt1 (@OwnerUserId INT)
AS 
BEGIN

    SELECT TOP 200 *
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @OwnerUserId
    AND p.PostTypeId = 1
    ORDER BY p.Score DESC, p.Id DESC;

END;
GO

Using the same value:

EXEC dbo.LargeUnusedGrant_alt1 @OwnerUserId = 8672;
GO

First run: sad memory grant

Second run: happy memory grant

Superfluous left join:

/*Niko's Triko*/
CREATE OR ALTER PROCEDURE dbo.LargeUnusedGrant_alt2 (@OwnerUserId INT)
AS 
BEGIN

    CREATE TABLE #t1 (Id INT, INDEX cx_whatever CLUSTERED COLUMNSTORE);

    SELECT TOP 200 *
    FROM dbo.Posts AS p
    LEFT JOIN #t1 ON 1 = 1
    WHERE p.OwnerUserId = @OwnerUserId
    AND p.PostTypeId = 1
    ORDER BY p.Score DESC, p.Id DESC;

END;
GO 

EXEC dbo.LargeUnusedGrant_alt2 @OwnerUserId = 8672;
GO