For the following schema and example data
CREATE TABLE T
(
A INT NULL,
B INT NOT NULL IDENTITY,
C CHAR(8000) NULL,
UNIQUE CLUSTERED (A, B)
)
INSERT INTO T
(A)
SELECT NULLIF(( ( ROW_NUMBER() OVER (ORDER BY @@SPID) - 1 ) / 1003 ), 0)
FROM master..spt_values
An application is processing the rows from this table in clustered index order in 1,000 row chunks.
The first 1,000 rows are retrieved from the following query.
SELECT TOP 1000 *
FROM T
ORDER BY A, B
The final row of that set is below
+------+------+
| A | B |
+------+------+
| NULL | 1000 |
+------+------+
Is there any way to write a query that just seeks into that composite index key and then follows it along to retrieve the next chunk of 1000 rows?
/*Pseudo Syntax*/
SELECT TOP 1000 *
FROM T
WHERE (A, B) is_ordered_after (@A, @B)
ORDER BY A, B
The lowest number of reads I've managed to get so far is 1020 but the query seems far too convoluted. Is there a simpler way of equal or better efficiency? Perhaps one that manages to do it all in one range seek?
DECLARE @A INT = NULL, @B INT = 1000
;WITH UnProcessed
AS (SELECT *
FROM T
WHERE ( EXISTS(SELECT A
INTERSECT
SELECT @A)
AND B > @B )
UNION ALL
SELECT *
FROM T
WHERE @A IS NULL AND A IS NOT NULL
UNION ALL
SELECT *
FROM T
WHERE A > @A
)
SELECT TOP 1000 *
FROM UnProcessed
ORDER BY A,
B
FWIW: If the column A
is made NOT NULL
and a sentinel value of -1
is used instead the equivalent execution plan certainly looks simpler
But the single seek operator in the plan still performs two seeks rather than collapsing it into a single contiguous range and the logical reads are much the same so I'm suspecting that maybe this is pretty much as good as it will get?
Best Answer
A favourite solution of mine is to use an
API
cursor:The overall strategy is a single scan that remembers its position between calls. Using an
API
cursor means we can return a block of rows rather than one at a time as would be the case with aT-SQL
cursor:The
STATISTICS IO
output is: