SQL Server – Resolving Performance Issues with BETWEEN Join and Eager Spool

performancequery-performancesql serversql server 2014

This query (anonymized) takes about 2 minutes to execute.

SELECT  
    ly.Col1 
    ,sr.Col2
    ,sr.Col3
    ,sr.Col4
INTO TempDb..TempLYT
FROM Tempdb..T1 ly
JOIN TempDb..T2 sr on sr.[DateTimeCol] BETWEEN ly.DateTimeStart and ly.DateTimeEnd
WHERE sr.Col5 = 1 OR sr.Col5 = 2

Query Plan

Are there some alternatives that would help with this query?

After applying the index suggested by Paul White, the query plan looks like this:

Optimized Query Plan

Best Answer

Ensure you have an index on T1 like:

CREATE INDEX i ON T1 (dtstart, dtend) INCLUDE (col1);

You might reverse the key columns depending on which is more selective with your data set.

The optimizer tends to introduce an eager spool for one of the inequalities and leave the second one in a Filter. Repro:

CREATE TABLE #T1 (col1 integer, dtstart datetime, dtend datetime);
CREATE TABLE #T2 (col2 integer, col3 integer, col4 integer, col5 integer, dtcol datetime);

UPDATE STATISTICS #T1 WITH ROWCOUNT = 5000;
UPDATE STATISTICS #T2 WITH ROWCOUNT = 100000;

SELECT
    T1.col1,
    T2.col2,
    T2.col3,
    T2.col4
FROM #T1 AS T1
JOIN #T2 AS T2 ON T2.dtcol BETWEEN T1.dtstart AND T1.dtend
WHERE T2.col5 IN (1, 2)
OPTION (QUERYTRACEON 8690); -- No perf spool

enter image description here

With the index:

enter image description here

Naturally, the seek is only for the first index column. The second will be a residual predicate.

Generally, interval queries can be hard to optimize. Dejan Sarka and Itzik Ben-Gan have some good coverage on the topic: