SQL Server – Rejoining Range Seek on Nullable Composite Index

indexsql server

For the following schema and example data

CREATE TABLE T
  (
     A INT NULL,
     B INT NOT NULL IDENTITY,
     C CHAR(8000) NULL,
     UNIQUE CLUSTERED (A, B)
  )

INSERT INTO T
            (A)
SELECT NULLIF(( ( ROW_NUMBER() OVER (ORDER BY @@SPID) - 1 ) / 1003 ), 0)
FROM   master..spt_values 

An application is processing the rows from this table in clustered index order in 1,000 row chunks.

The first 1,000 rows are retrieved from the following query.

SELECT TOP 1000 *
FROM   T
ORDER  BY A, B 

The final row of that set is below

+------+------+
|  A   |  B   |
+------+------+
| NULL | 1000 |
+------+------+

Is there any way to write a query that just seeks into that composite index key and then follows it along to retrieve the next chunk of 1000 rows?

/*Pseudo Syntax*/
SELECT TOP 1000 *
FROM   T
WHERE (A, B) is_ordered_after (@A, @B)
ORDER  BY A, B 

The lowest number of reads I've managed to get so far is 1020 but the query seems far too convoluted. Is there a simpler way of equal or better efficiency? Perhaps one that manages to do it all in one range seek?

DECLARE @A INT = NULL, @B INT = 1000

;WITH UnProcessed
     AS (SELECT *
         FROM   T
         WHERE  ( EXISTS(SELECT A
                         INTERSECT
                         SELECT @A)
                  AND B > @B )
         UNION ALL
         SELECT *
         FROM   T
         WHERE @A IS NULL AND A IS NOT NULL
         UNION ALL
         SELECT *
         FROM   T
         WHERE A > @A        
         )
SELECT TOP 1000 *
FROM   UnProcessed
ORDER  BY A,
          B 

enter image description here


FWIW: If the column A is made NOT NULL and a sentinel value of -1 is used instead the equivalent execution plan certainly looks simpler

enter image description here

But the single seek operator in the plan still performs two seeks rather than collapsing it into a single contiguous range and the logical reads are much the same so I'm suspecting that maybe this is pretty much as good as it will get?

Best Answer

Is there any way to write a query that just seeks into that composite index key and then follows it along to retrieve the next chunk of 1000 rows?

A favourite solution of mine is to use an API cursor:

SET NOCOUNT ON;
SET STATISTICS IO ON;

DECLARE 
    @cur integer,
    -- FAST_FORWARD, AUTO_FETCH, AUTO_CLOSE, CHECK_ACCEPTED_TYPES, FAST_FORWARD_ACCEPTABLE
    @scrollopt integer = 16 | 8192 | 16384 | 32768 | 1048576,
    -- READ_ONLY, CHECK_ACCEPTED_OPTS, READ_ONLY_ACCEPTABLE
    @ccopt integer = 1 | 32768 | 65536, 
    @rowcount integer = 1000,
    @rc integer;

-- Open the cursor and return (up to) the first 1000 rows
EXECUTE @rc = sys.sp_cursoropen
    @cur OUTPUT,
    N'
    SELECT A, B, C
    FROM T
    ORDER BY A, B;
    ',
    @scrollopt OUTPUT,
    @ccopt OUTPUT,
    @rowcount OUTPUT;

IF @rc <> 16 -- FastForward cursor automatically closed
BEGIN
    -- Name the cursor so we can use CURSOR_STATUS
    EXECUTE sys.sp_cursoroption
        @cur, 
        2, 
        'MyCursorName';

    -- Until the cursor auto-closes
    WHILE CURSOR_STATUS('global', 'MyCursorName') = 1
    BEGIN
        EXECUTE sys.sp_cursorfetch
            @cur,
            2,
            0,
            1000;
    END;
END;

SET STATISTICS IO OFF;

The overall strategy is a single scan that remembers its position between calls. Using an API cursor means we can return a block of rows rather than one at a time as would be the case with a T-SQL cursor:

Execution plans

The STATISTICS IO output is:

Table 'T'. Scan count 1, logical reads 1011, physical reads 0, read-ahead reads 0
Table 'T'. Scan count 1, logical reads 1001, physical reads 0, read-ahead reads 0
Table 'T'. Scan count 1, logical reads 516, physical reads 0, read-ahead reads 0