The cost is the same (1%) for both the slow and fast cases. Does that
mean the warning can be ignored? Is there a way to show "actual" times
or costs? That would be so much better! Actual row counts are the same
for the operation with the spill.
The cost shown is always the optimizer's estimated cost of the iterator, computed according to its internal model. This model does not reflect your server's particular performance characteristics; it is an abstraction that happens to produce reasonable plan shapes most of the time for most queries on most systems. There is no way to show 'actual' costs/execution times per iterator.
Besides performing a manual text diff of XML execution plans to find
the differences in warnings, how can I tell what the 1500% increase in
runtime is actually due to?
Typically, you can't. Spill warnings (sorts, hashes, exchanges) are new in execution plans for SQL Server 2012, but they are just an indication of something you should investigate and look to eliminate if possible. The impact of a particular spill is something that needs to be measured; it is not possible to say that a spill of a particular type will always result in an x% performance drop, for example.
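If you would rather measure spills than diff plan XML, one option is to catch them as they happen with Extended Events. A minimal sketch (the session and target file names are illustrative):

-- Sketch: log sort, hash, and exchange spills as they occur
CREATE EVENT SESSION spill_watch ON SERVER
ADD EVENT sqlserver.sort_warning,
ADD EVENT sqlserver.hash_warning,
ADD EVENT sqlserver.exchange_spill
ADD TARGET package0.event_file (SET filename = N'spill_watch')
WITH (STARTUP_STATE = OFF);

ALTER EVENT SESSION spill_watch ON SERVER STATE = START;

This only tells you where and when spills occur; attributing a runtime delta to them still requires controlled testing.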
For the slow case, tempdb before/after (select * from
sys.fn_virtualfilestats(db_id('tempdb'), null)) only shows a few
hundred ms of latency.
Spilling to tempdb and back is certainly undesirable, but the overall impact is hard to assess. For sort and hash spills, the impact is largely due to the amount of I/O and its access pattern, which may be small-block synchronous I/O (for sort spills, for example). With ~100ms of latency, you don't need too many synchronous I/Os to introduce a significant delay. The nature of the process and its I/O patterns mean tempdb spills can still be a problem even on very low-latency storage like Fusion-io.
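To quantify the spill I/O itself, a sketch that diffs the per-file counters around the slow run (the temp table name is illustrative; sys.dm_io_virtual_file_stats is the DMV equivalent of the function you used):

-- Snapshot tempdb file I/O counters before the run
SELECT file_id, num_of_reads, num_of_writes,
       io_stall_read_ms, io_stall_write_ms
INTO #before
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL);

-- ...run the slow query here, then diff against a fresh snapshot
SELECT s.file_id,
       s.num_of_reads - b.num_of_reads AS reads,
       s.io_stall_read_ms - b.io_stall_read_ms AS read_stall_ms,
       s.num_of_writes - b.num_of_writes AS writes,
       s.io_stall_write_ms - b.io_stall_write_ms AS write_stall_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS s
JOIN #before AS b ON b.file_id = s.file_id;

A large read/write count with high stall per operation during the slow run points at the spill rather than anything else on the instance.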
For exchange spills, there is an extra delay. The intra-query deadlock must be detected by the regular deadlock monitor, which by default only wakes up once every 5 seconds (more frequently if a deadlock has been found recently).
The resolver must then choose one or more victims, and spool exchange buffers to tempdb until the deadlock is resolved. The amount of spooling needed and the complexity of the deadlock will largely determine how long this takes.
Ultimately, preserved ordering is a Very Bad Thing for parallelism in general. Ideally, we want multiple concurrent threads operating on data streams with no inter-dependence. Preserving sort order introduces dependencies: producer and consumer threads in different parallel branches can become deadlocked while an order-preserving iterator waits to receive a row from each of its inputs before it can decide which row comes next in sequence.
The precise nature of the deadlock depends on data distribution and per-thread sort order at runtime, so it is typically very hard to debug. Hence my recommendation to avoid order-preserving iterators in parallel plans, especially at high DOP. I explain a very simplified example of an order-preserving parallel deadlock in some of my talks, but real examples are always more complex, though the underlying cause is the same.
In case the concepts are not familiar, it may help to work through the following example, reproduced from the (somewhat epic) 1993 paper Query Evaluation Techniques for Large Databases by Goetz Graefe:
If a different partitioning strategy than range-partitioning is used,
sorting with subsequent partitioning is not guaranteed to be
deadlock-free in all situations. Deadlock will occur if (i) multiple
consumers feed multiple producers, and (ii) each producer produces a
sorted stream and each consumer merges multiple sorted streams, and
(iii) some key-based partitioning rule is used other than range
partitioning, i.e., hash partitioning, and (iv) flow control is
enabled, and (v) the data distribution is particularly unfortunate.
Figure 37 shows a scenario with two producer and two consumer
processes, i.e., both the producer operators and the consumer
operators are executed with a degree of parallelism of two. The
circles in Figure 37 indicate processes, and the arrows indicate data
paths. Presume that the left sort produces the stream 1, 3, 5, 7, ...,
999, 1002, 1004, 1006, 1008, ..., 2000 while the right sort produces
2, 4, 6, 8, ..., 1000, 1001, 1003, 1005, 1007, ..., 1999.
The merge operations in the consumer processes must receive the first
item from each producer process before they can create their first
output item and remove additional items from their input buffers.
However, the producers will need to produce 500 items each (and insert
them into one consumer’s input buffer, all 500 for one consumer)
before they will send their first item to the other consumer. The data
exchange buffer needs to hold 1000 items at one point of time, 500 on
each side of Figure 37. If flow control is enabled and the exchange
buffer (flow control slack) is less than 500 items, deadlock will
occur.
The reason deadlock can occur in this situation is that the producer
processes need to ship data in the order obtained from their input
subplan (the sort in Figure 37) while the consumer processes need to
receive data in sorted order as required by the merge. Thus, there are
two sides which both require absolute control over the order in which
data pass over the process boundary. If the two requirements are
incompatible, an unbounded buffer is required to ensure freedom from
deadlock.
Update:
One of your comments raised a really good point that I sadly didn't think of at first. This is actually a pretty common place to look: the higher your MAXDOP, the more threads a query can use, and the more threads running at once inside a query, the more memory it can consume. You said your MAXDOP is 10 server wide. That "feels" high. The latest guidance is 8 if you have more than 8 logical CPUs, or something between 0 and N if you have fewer than 8. I tend to look at the number of cores per NUMA node and may go lower than 8 for some systems running OLTP workloads, and sometimes higher. But you might review these guidelines, consider changing your instance-wide MAXDOP, and test to see the impact. Going lower than 10 will reduce the memory grants for these queries.
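For reference, a sketch of checking and changing the instance-wide setting (8 here is just the guideline above, not a prescription; test first):

-- Show the current instance-wide MAXDOP
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism';

-- Example: lower it from 10 to 8 per the guidance, then measure the impact
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;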
This doesn't eliminate the need to tune queries, however, and I think there is probably room for that. Basically, you need to either continually increase memory until you can satisfy all your grants, lower MAXDOP, tune queries so they request smaller memory grants, reduce the workload by running fewer queries at once, or turn to Resource Governor settings, which can artificially cap the grant amount (the memory grant percentage especially) but will almost certainly increase TempDB dependence.
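If you do end up at the Resource Governor option, the relevant knob is REQUEST_MAX_MEMORY_GRANT_PERCENT on a workload group. A minimal sketch (the group name is illustrative, and routing sessions into it additionally requires a classifier function):

-- Cap per-query memory grants at 10% of the pool for this group
CREATE WORKLOAD GROUP limited_grants
WITH (REQUEST_MAX_MEMORY_GRANT_PERCENT = 10);

ALTER RESOURCE GOVERNOR RECONFIGURE;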
Those are the main approaches I take. Query tuning, perhaps combined with MAXDOP changes, is a good place to start. I tend to leave Resource Governor as a last resort; I've had good luck with tuning queries and adjusting server-wide or query-specific DOP.
Original Answer
Query memory grants are SQL Server's way of ensuring the right amount of memory is available before a query runs. They are a way to make sure you aren't running an excessive number of memory-hungry queries at once, which would all but guarantee a lot of memory-consuming operations spilling to TempDB.
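You can watch grants, and queries queued waiting for them, while the workload runs; a sketch:

-- Current memory grants; a NULL grant_time means the query is still waiting
SELECT session_id, dop, request_time, grant_time,
       requested_memory_kb, granted_memory_kb,
       used_memory_kb, max_used_memory_kb
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;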
In this case I imagine your Top 3 queries are consuming a lot of memory, and you were right to look at their plans. Hashes and sorts are probably among the issues here. Tuning the queries so that their plans, and in turn their memory consumption, are appropriate is a great solution here. I doubt this is a bug; it is the way SQL Server was designed.
This article is a bit old, but the knowledge contained within still applies. It is a good place to begin understanding query memory grants.
If tuning the queries doesn't save the day here, there are other tricks that can be employed. I've also asked some questions in the comments section of your question that may help expand an answer from me or someone else.
Best Answer
SQL Server uses input and output memory fractions on each memory-consuming operator as part of assessing memory grant reuse opportunities during query execution.
The hash inner join shares memory grant with the hash anti semi join during its input phase (while the inner join is reading rows to build its hash table). The sum of the two input memory fractions is 0.524805 (inner join) plus 0.475195 (anti semi join) = 1 (fraction of the whole memory grant).
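These fractions come from the showplan XML, where each memory-consuming operator carries a MemoryFractions element. A sketch for listing them from a plan still in cache (you would normally filter to the plan_handle of interest rather than scanning the whole cache):

-- Extract input/output memory fractions per plan operator
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT r.n.value('@NodeId', 'int') AS node_id,
       r.n.value('@PhysicalOp', 'nvarchar(60)') AS physical_op,
       r.n.value('(MemoryFractions/@Input)[1]', 'float') AS input_fraction,
       r.n.value('(MemoryFractions/@Output)[1]', 'float') AS output_fraction
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY qp.query_plan.nodes('//RelOp') AS r(n)
WHERE r.n.exist('MemoryFractions') = 1;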
However, this does not mean the hash inner join can consume
0.524805 * 17,060,288KB ~= 8,952,960KB
despite what the spill warning text says. This is because the same hash table still exists while the hash inner join is producing rows. During this phase of its execution, it shares the memory grant with the Sort at node 11, which is reading rows. This combination of concurrently executing operators is captured by the output memory fraction for the inner hash join, which is 0.0648054. Adding this to the sort's input fraction (0.876515) and the anti semi join's output fraction (0.0586793) again sums to 1.
That output fraction only allows
0.0648054 * 17,060,288 = 1,105,598KB
of memory grant. The hash table must fit within this amount of memory, or it will spill. The runtime memory used figure reported in the spill warning text is 1,677,120KB: the granted 1,105,598KB plus 79,696 8KB pages spilled to tempdb.
The fundamental point is that memory grant calculations are proportional to the estimated size of the data (and sometimes the number of unique values). Both row count (cardinality) and average row size are important. If these estimates are wrong, the memory-consuming operator may spill.
All the normal actions appropriate to improving cardinality/density/average row size estimates may be applicable; it is impossible to say more from the anonymized plan.
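The usual first step applies regardless; a sketch (the table name is purely illustrative, since the anonymized plan hides the real ones):

-- Refresh statistics with a full scan to improve row count
-- and average row size estimates
UPDATE STATISTICS dbo.SomeTable WITH FULLSCAN;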
My preference would be to adopt columnstore and batch mode processing. Aside from the usual benefits, batch mode also allows sorts and hashes to acquire additional memory dynamically (documented and supported trace flags may be required).
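A sketch of the columnstore direction (table and column names are illustrative; on 2012 only the read-only nonclustered variant is available, and the dynamic-memory behaviour is tied to documented trace flags such as 9389 on later versions):

-- A nonclustered columnstore index enables batch mode processing
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_SomeTable
ON dbo.SomeTable (col1, col2, col3);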
Extract of my full answer posted on the copy of this question at the SQLPerformance.com Q & A.