SQL Server 2016 – Why a SELECT Query Causes Writes

sql serversql-server-2016

I've noticed that on a server running SQL Server 2016 SP1 CU6 sometimes an Extended Events session shows a SELECT query causing writes.
For example:

The execution plan shows no obvious cause for the writes, such as a hash table, spool, or sort that could spill to TempDB:

Variable assignment to a MAX type or an automatic statistics update could also cause this, but neither was the cause of the writes in this case.

What else could the writes be from?

Best Answer

Clumsy

I couldn't remember if I included these in my original answer, so here's another couple.

Spools!

SQL Server has lots of different spools, which are temporary data structures stored off in tempdb. Two examples are Table and Index spools.

When they occur in a query plan, the writes to those spools will be associated with the query.

These will also be registered as writes in DMVs, profiler, XE, etc.

Index Spool

Table Spool

The amount of writes performed will go up with the size of the data spooled, obviously.

Spills

When SQL Server doesn't get enough memory for certain operators, it may spill some pages to disk. This primarily happens with sorts and hashes. You can see this in actual execution plans, and in newer versions of SQL server, spills are also tracked in dm_exec_query_stats.

SELECT deqs.sql_handle,
       deqs.total_spills,
       deqs.last_spills,
       deqs.min_spills,
       deqs.max_spills
FROM sys.dm_exec_query_stats AS deqs
WHERE deqs.min_spills > 0;

Tracking

You can use a similar XE session as the one I used above to see these in your own demos.

CREATE EVENT SESSION spools_and_spills
    ON SERVER
    ADD EVENT sqlserver.sql_batch_completed
    ( ACTION ( sqlserver.sql_text ))
    ADD TARGET package0.event_file
    ( SET filename = N'c:\temp\spools_and_spills' )
    WITH ( MAX_MEMORY = 4096KB,
           EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
           MAX_DISPATCH_LATENCY = 1 SECONDS,
           MAX_EVENT_SIZE = 0KB,
           MEMORY_PARTITION_MODE = NONE,
           TRACK_CAUSALITY = OFF,
           STARTUP_STATE = OFF );
GO

Related Solutions

Sql-server – How to row estimates be improved in order to reduce chances of spills to tempdb

I won't comment about spills, tempdb or hints because the query seems pretty simple to need that much consideration. I think SQL-Server's optimizer will do its job quite good, if there are indexes suited for the query.

And your splitting into two queries is good as it shows what indexes will be useful. The first part:

(select convert(bigint, Value) NodeId
 from Oav.ValueArray
 where PropertyId = 3331  
   and ObjectId = 3540233
   and Sequence = 2)

needs an index on (PropertyId, ObjectId, Sequence) including the Value. I'd make it UNIQUE to be safe. The query would throw error anyway during runtime if more than one rows were returned, so it's good to ensure in advance that this won't happen, with the unique index:

CREATE UNIQUE INDEX
    PropertyId_ObjectId_Sequence_UQ
  ON Oav.ValueArray
    (PropertyId, ObjectId, Sequence) INCLUDE (Value) ;

The second part of the query:

select Value
  from Oav.ValueArray
 where ObjectId = @a               
   and PropertyId = 2840

needs an index on (PropertyId, ObjectId) including the Value:

CREATE INDEX
    PropertyId_ObjectId_IX
  ON Oav.ValueArray
    (PropertyId, ObjectId) INCLUDE (Value) ;

If efficiency is not improved or these indexes were not used or there are still differences in row estimates appearing, then there would be need to look further into this query.

In that case, the conversions (needed from the EAV design and the storing of different datatypes in the same columns) are a probable cause and your solution of splitting (as @AAron Bertrand and @Paul White comment) the query into two parts seems natural and the way to go. A redesign so to have different datatypes in their respective columns might be another.

Sql-server – Why are queries causing spill to tempdb

There's several different questions in here:

Q: Why weren't the queries spilling before?

They were, but SQL Server Management Studio didn't surface this as a clear error prior to SQL 2012. It's a great example of why when you're doing performance tuning, you have to go deeper than the graphical execution plan.

Q: Why do queries spill to disk?

Because SQL Server didn't grant them enough memory to complete their operations. Perhaps the execution plan underestimated the amount of memory required, or perhaps the box is under memory pressure, or they're just big queries. (Remember, SQL Server uses memory for three things - caching raw data pages, caching execution plans, and workspace for queries. That workspace memory ends up being fairly small.)

Q: How can I reduce spills?

By writing sargable T-SQL statements, having up-to-date statistics, putting enough memory in the server, building the right indexes, and interpreting the execution plans when things don't work out the way you expected. Check out Grant Fritchey's book SQL Server Query Performance Tuning for detailed explanations of all of those.