SQL Server Performance – Sort Spills to TempDB Due to VARCHAR(MAX)

cardinality-estimatesperformancequery-performancesql servertempdb

On a server with 32GB we are running SQL Server 2014 SP2 with a max memory of 25GB we have two tables, here you find a simplified structure of both tables:

CREATE TABLE [dbo].[Settings](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [resourceId] [int] NULL,
    [typeID] [int] NULL,
    [remark] [varchar](max) NULL,
    CONSTRAINT [PK_Settings] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Resources](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [resourceUID] [int] NULL,
 CONSTRAINT [PK_Resources] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO

with following non-clustered indexes:

CREATE NONCLUSTERED INDEX [IX_UID] ON [dbo].[Resources]
(
    [resourceUID] ASC
)

CREATE NONCLUSTERED INDEX [IX_Test] ON [dbo].[Settings]
(
    [resourceId] ASC,
    [typeID] ASC
)

The database is configured with compatibility level 120.

When I run this query there are spills to tempdb.
This is how I execute the query:

exec sp_executesql N'
select r.id,remark
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38

If don't select the [remark] field no spills occurs. My first reaction was that the spills occurred due to the low number of estimated rows on the nested-loop operator.

So I add 5 datetime and 5 integer columns to the settings table and add them to my select statement. When I execute the query no spills are happening.

Why are the spills only happening when [remark] is selected? It has probably something to do with the fact that this is a varchar(max).
What can I do to avoid spilling to tempdb?

Adding OPTION (RECOMPILE) to the query makes no difference.

Best Answer

There are going to be several possible workarounds here.

You can manually adjust the memory grant, though I probably wouldn't go that route.

You can also use a CTE and TOP to push the sort lower, before grabbing the max length column. It will look something like below.

WITH CTE AS (
SELECT TOP 1000000000 r.ID, s.ID AS ID2, s.typeID
FROM Resources r
inner join Settings s on resourceid=r.id
where resourceUID=@UID
ORDER BY s.typeID
)
SELECT c.ID, ca.remark
FROM CTE c
CROSS APPLY (SELECT remark FROM dbo.Settings s WHERE s.id = c.ID2) ca(remark)
ORDER BY c.typeID

Proof-of-concept dbfiddle here. Sample data would still be appreciated!

If you want to read an excellent analysis by Paul White, read here.