Sql-server – Different execution plans in SQL Server for same query but different offset

execution-planfull-text-searchsql serversql-server-2019

I am doing performance experiments regarding the fulltext search feature of Microsoft SQL Server using the stackoverflow database.

I have a (seemingly simple) query that results in different execution plans (and vastly different run times) depending simply on the specified offset value.

This version of the query (offset 0) is slow:

SELECT Comments.Id, Comments.PostId, Comments.Score, Comments.Text, Comments.CreationDate, Comments.UserId
FROM Comments   
WHERE contains(Text,'SQL')
AND contains(Text,'Server')
AND contains(Text,'2016')
AND Score IS NOT NULL
ORDER BY Score DESC
OFFSET 0 rows
FETCH NEXT 100 ROWS ONLY
OPTION (RECOMPILE);
(100 rows affected)
Table 'Worktable'. Scan count 2159279, logical reads 4325811, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Comments'. Scan count 1, logical reads 12299107, physical reads 1661, page server reads 0, read-ahead reads 2466352, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 43515 ms,  elapsed time = 23474 ms.

Execution plan:
enter image description here

This version of the query (offset 2000), however, completes within fractions of a second:

SELECT Comments.Id, Comments.PostId, Comments.Score, Comments.Text, Comments.CreationDate, Comments.UserId
FROM Comments   
WHERE contains(Text,'SQL')
AND contains(Text,'Server')
AND contains(Text,'2016')
AND Score IS NOT NULL
ORDER BY Score DESC
OFFSET 2000 rows
FETCH NEXT 100 ROWS ONLY
OPTION (RECOMPILE);
(100 rows affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Comments'. Scan count 0, logical reads 11853, physical reads 0, page server reads 0, read-ahead reads 30, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 47 ms,  elapsed time = 20 ms.

Execution plan:
enter image description here

If, in the first query, I replace the ORDER BY / OFFSET 0 / FETCH NEXT 100 by a simple TOP 100, but of course, this is not a viable solution in practice, because I want that specific (and deterministic) order:

(100 rows affected)
Table 'Comments'. Scan count 0, logical reads 821, physical reads 0, page server reads 0, read-ahead reads 107, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 6 ms.

enter image description here

More info about the data and schema:

  • There are 74,428,966 rows in the table Comments.
  • Here is the table definition:
CREATE TABLE [dbo].[Comments](
    [Id] [int] NOT NULL,
    [PostId] [int] NULL,
    [Score] [int] NULL,
    [Text] [nvarchar](max) NULL,
    [CreationDate] [datetime] NULL,
    [UserId] [int] NULL,
 CONSTRAINT [PK_Comments] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
  • There is a full text search index on the Text column
  • Here are the other indexes:
CREATE NONCLUSTERED INDEX [Comments_CreationDate] ON [dbo].[Comments]
(
    [CreationDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [Comments_PostId] ON [dbo].[Comments]
(
    [PostId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [Comments_Score] ON [dbo].[Comments]
(
    [Score] ASC,
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [Comments_UserId] ON [dbo].[Comments]
(
    [UserId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]

My question is: Why are the execution plans different and what would be the recommended way to improve the situation, such that both queries use the fast execution plan?

Addendum:

Based on one of Nikitas suggestions, I tried an altered version of the two queries.

Slow:

WITH CTE AS (SELECT Id   
FROM Comments   
WHERE contains(Text,'SQL')
AND contains(Text,'Server')
AND contains(Text,'2016')
AND Score IS NOT NULL
ORDER BY Score DESC
OFFSET 0 rows
FETCH NEXT 100 ROWS ONLY
)
SELECT Comments.Id, Comments.PostId, Comments.Score, Comments.Text, Comments.CreationDate, Comments.UserId
FROM Comments
JOIN CTE on Comments.Id = CTE.Id
ORDER BY Comments.score DESC
OPTION (RECOMPILE);
(100 rows affected)
Table 'Worktable'. Scan count 2009360, logical reads 4025794, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Comments'. Scan count 1, logical reads 3915, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 2125 ms,  elapsed time = 2117 ms.

Execution plan

Fast (same query, except OFFSET=2000):

(100 rows affected)
Table 'Comments'. Scan count 9, logical reads 131034, physical reads 0, page server reads 0, read-ahead reads 1304, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 1548 ms,  elapsed time = 217 ms.

Execution plan

Best Answer

Different offset means different number of rows should be matched - 100 in the first case and 2100 in the second one. In the second case SQL Server may choose more expensive parallel execution plan but it may work faster by different reasons. For example, because of statistics misestimation the first query may work much slower than expected and it would be better to choose the second plan for it from the very beginning. To do this you could parametrize your query and use hint OPTIMIZE FOR @offset = 2000.
Also with proper indexing it may has a sense to use a subquery to select required row ids with offset first, and after that join other fields to these 100 ids rather than SELECT * with offset 2000 (this way SQL Server may need to sort a lot of large rows before it be able to omit first 2000 of them).
Please check the following query (for both offsets and with statistics time, io on enabled):

WITH CTE AS (SELECT id   
FROM Comments   
WHERE contains(Text,'SQL')
AND contains(Text,'Server')
AND contains(Text,'2016')
AND Score IS NOT NULL
ORDER BY Score DESC
OFFSET 0 rows
FETCH NEXT 100 ROWS ONLY
)
SELECT Comments.* 
FROM Comments
JOIN CTE on Comments.id = CTE.id
ORDER BY Comments.score DESC
OPTION (RECOMPILE)
;