Sql-server – Memory Optimized table: Index size 10X bigger than data

memory-optimized-tablessql serversql-server-2017

On one of our production servers, I found out that Index size is 10X times bigger than data size.

These servers and schemas are identical with a similar workload.

CREATE TABLE [dbo].[process_aggregation_close]
(
    [organization_id] [bigint] NOT NULL,
    [computer_id] [uniqueidentifier] NOT NULL,
    [process_id] [int] NOT NULL,
    .
    .
INDEX [ix_process_aggregation_memory_computer_id_process_id] NONCLUSTERED 
(
    [organization_id] ASC,
    [computer_id] ASC,
    [process_id] ASC
)
)WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_ONLY )
GO

Tables are staging tables and they are involved in very frequent insert/delete operations.
sys.dm_db_xtp_table_memory_stats

SQL Server Version: MS SQL 2017 CU9 Standard edition

Best Answer

The size difference is due to how row versioning in indexes and the garbage collector of in memory tables work.

Looking at the garbage collection process used for memory optimized tables:

If there is no transactional activity after (for example) deleting a large number of rows and there is no memory pressure, the deleted rows will not be garbage collected until the transactional activity resumes or there is memory pressure

Okay, that makes sense for the indexes, but why is the data cleared while the indexes are not cleared?

My educated guess is that removing the row versioning from indexes is a different process than removing the overwritten data from the table. More on that here.

An example for this, with a query that inserts 1M rows and executes a full delete on a table that consists of the three columns you have in your question.

-- insert data into the tables  
SET NOCOUNT ON;
DECLARE @I int =0;
WHILE @I < 1000000
BEGIN
    INSERT INTO dbo.MemOptimizedTable([organization_id], [computer_id], [process_id])
    VALUES (cast(@I as bigint) + 2147483647, NEWID(),@I) ;
    SET @I +=1;
END
-- Delete everything
 DELETE FROM  dbo.MemOptimizedTable;

SELECT object_id, memory_used_by_table_kb, memory_used_by_indexes_kb 
FROM sys.dm_db_xtp_table_memory_stats;

After running the query the first time

object_id   memory_used_by_table_kb memory_used_by_indexes_kb
1045578763          62501           58852

Running the query 3 times gives us this starting point where the index size has grown beyond the table size

object_id   memory_used_by_table_kb memory_used_by_indexes_kb
965578478           62501           78464

While the insert of 1M records is running again, the mem used by table grows

--Insert running 
object_id   memory_used_by_table_kb memory_used_by_indexes_kb
965578478           86715            88471

And grows

--Insert running
object_id   memory_used_by_table_kb memory_used_by_indexes_kb
965578478           123628           104662

Until the insert is finished

--Insert done
object_id   memory_used_by_table_kb memory_used_by_indexes_kb
965578478           62500            79543

The index grew, the memory used by the table was removed after the insert and remained the same. After this insert finishes, a process is triggered to remove the excess memory that was used by the table.

This is not by any means the complete explanation but could be a starting point.

What can we do?

According to this post there’s also no way to force garbage collection to occur.

You could op to

  • Do nothing and let SQL Server handle it
  • Drop and create the index to reset the memory usage (if you have another index on the table)
  • ...

Commands to drop and create the index example

ALTER TABLE  dbo.table
       DROP INDEX [ix_process_aggregation_memory_computer_id_process_id];  

ALTER TABLE  dbo.table
       ADD INDEX [ix_process_aggregation_memory_computer_id_process_id] ([organization_id] ASC,[computer_id] ASC, [process_id] ASC);

Create table statement of the table used in the example

 CREATE TABLE dbo.MemOptimizedTable(   
  [organization_id] [bigint] PRIMARY KEY NONCLUSTERED NOT NULL,
    [computer_id] [uniqueidentifier] NOT NULL,
    [process_id] [int] NOT NULL,
INDEX [ix_process_aggregation_memory_computer_id_process_id] NONCLUSTERED 
(
    [organization_id] ASC,
    [computer_id] ASC,
    [process_id] ASC
)
)WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_ONLY)
GO