SQL Server 2019 – Managing In-Memory Optimized TempDB Metadata Growth

memorymemory-optimized-tablessql serversql-server-2019tempdb

Problem

We have enabled sp_configure 'tempdb metadata memory-optimized' = 1, and now the tempdb meta data is taking over 400 GB on one of our servers and continues to grow. There are some drops in memory usage, but generally it keeps growing it's memory usage. We've had a couple times where the server actually crashes because there's not enough memory for the other system processes to modify tempdb and it brings down the whole server.

Question I am asking

How do I keep the SQL Server in-memory optimized tempdb metadata from growing continually and crashing my server? If anything, what are some other pieces of information that I can look into to find what's consuming so much memory?

Data about the problem

sys.dm_os_memory_clerks

The following query currently returns 438 GB.

SELECT SUM(domc.pages_kb / 1024.0 / 1024.0) AS pages_gb
FROM sys.dm_os_memory_clerks AS domc
WHERE domc.type LIKE 'MEMORYCLERK_XTP'

sys.dm_db_xtp_memory_consumers

The following query provides the data that the biggest usage of memory (290 GB) is memory_consumer_id of 113 – 'LOB Page Allocator'. It has no object_id or xtp_object_id, so I'm guessing that it's a database wide object.

SELECT ddxmc.memory_consumer_id
     , ddxmc.memory_consumer_type_desc
     , ddxmc.memory_consumer_desc
     , ddxmc.object_id
     , ddxmc.xtp_object_id
     , ddxmc.used_bytes / 1024.0 / 1024.0 / 1024.0 AS used_gb
FROM sys.dm_db_xtp_memory_consumers AS ddxmc
ORDER BY ddxmc.allocated_bytes DESC

Environment

Version: SQL Server 2019 CU9 – Enterprise
Memory on box: 3 TB
Instance Type: Fail-Over Clustered Instance
OS: Windows Server 2019 Standard
Number of CPU Cores: 80 Physical Cores (we've had to increase this number lately)
Number Of Tempdb Files: 64
Replication: This server is a publisher and subscriber to transactional replication.

Usage patterns

We are heavy users of tempdb. We are continually creating and dropping temp tables and table variables all the time in stored procedures. We need to work with bulk amounts of data, passing in lists of data that we then turn into tables to work with result set data instead of working with one piece of information at a time. Because of the heavy churn on tempdb we have had to implement the in-memory optimized tempdb to relieve PAGELATCH_* waits we were experiencing.

I am pretty sure there are NOT any user tables that are memory optimized. When I run the following query I only see SYSTEM_TABLE type objects:

SELECT *
FROM sys.dm_db_xtp_index_stats AS ddxis
    JOIN sys.objects AS o ON o.object_id = ddxis.object_id

Things I'm aware of and what I've tried

In-memory garbage collection can only clean up items older then the oldest transaction.

I'm quite aware that garbage collection can only happen on rows that are older then the oldest transaction, and so I have stopped all of our long running SQL Agent jobs and any other processes longer then 5 minutes. Unfortunately our memory usage didn't come back down. We do have old sessions, but none of them have any transactions open according to the following query.

SELECT *
FROM sys.dm_exec_sessions AS des
WHERE des.open_transaction_count > 0

Replication

To rule out replication I did temporarily stop the publishing and subscription agents for this server. No change in memory usage.

Checkpoint

I have run 'CHECKPOINT' in tempdb. Memory usage didn't come down.

DBCC FREEPROCCACHE – Getting rid of old temp tables

To drop old temp tables that are being cached I cleared the query plan cache and verified that the temp tables were re-created by running the following query. This did not result in the memory usage dropping any substantial amount.

SELECT *
FROM sys.tables AS t
WHERE t.name LIKE '#%'
    AND t.create_date < 'TimeOfClearingPlanCache'

Other Possible Solutions

Restarting SQL Server

We can restart SQL Server, and it does mitigate the problem for a while, but the memory usage does increase again after a while. While this is a work around, this is a terrible solution that we are not very fond of.

Turning Off 'In-Memory Optimized Tempdb'

We could turn off 'in-memory optimized tempdb', but we then would be subject to the heavy PAGELATCH_* waits we were experiencing before. We have 64 tempdb files to help reduce the contention that we were seeing, but even that wasn't enough during our heavy times. This could be an option, but it would be best if we could find why the memory usage is continually growing.

Best Answer

Pam Lahoud SQL Enterprise Team Principal Program Manager at Microsoft said on twitter:

Tweet

and contributed via comment here:

There are a few known issues that could be coming into play here. It's best to open a case with support if you can, the more data we have on these issues the more likely we'll be able to find a fix.