MongoDB – Using Too Much Memory

Tags: memory, mongodb, wiredtiger

We've been using MongoDB for several weeks now, and the overall trend we've seen is that MongoDB is using far too much memory (much more than the total size of its dataset plus indexes).

I've already read through this question and this question, but neither seems to address the issue I've been facing; they essentially restate what's already explained in the documentation.

The following are the results of the htop and show dbs commands.

(htop screenshot: mongod using ~87 GB of resident memory)

(show dbs screenshot: compressed sizes of the databases)

I know that MongoDB uses memory-mapped I/O, so basically the OS handles caching things in memory, and MongoDB should theoretically release its cached memory when another process requests free memory, but from what we've seen, it doesn't.

The OOM killer kicks in and starts killing other important processes, e.g. postgres, redis, etc. (As can be seen, to get around this problem we've increased the RAM to 183 GB, which works for now but is pretty expensive. mongod is using ~87 GB of RAM, nearly 4× the size of its whole dataset.)

So,

  1. Is this much memory usage really expected and normal? (Per the documentation, WiredTiger uses at most ~60% of RAM for its cache, but given the dataset size, does it even have enough data to take up 86 GB of RAM?)
  2. Even if the memory usage is expected, why won't MongoDB release its allocated memory when another process starts requesting more memory? Before we increased the RAM, various other running processes, including mongod itself, were constantly being killed by the Linux OOM killer, which made the system totally unstable.
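
For reference, mongod's own view of its memory can be checked from the mongo shell; assuming the usual serverStatus fields (values are reported in MB), something like:

// Resident and virtual memory as mongod itself sees it (reported in MB)
var mem = db.serverStatus().mem;
print("resident: " + mem.resident + " MB, virtual: " + mem.virtual + " MB");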

Thanks!

Best Answer

Okay, so after following the clues given by loicmathieu and jstell, and digging a little deeper, these are the things I found out about MongoDB with the WiredTiger storage engine. I'm putting it here in case anyone else runs into the same questions.

The memory-usage threads I mentioned were all from 2012–2014; they all pre-date WiredTiger and describe the behavior of the original MMAPv1 storage engine, which has no separate cache and no compression support.

The WiredTiger cache settings control only the size of memory directly used by the WiredTiger storage engine (not the total memory used by mongod). Many other things potentially take memory in a MongoDB/WiredTiger configuration, such as the following (a quick way to check the cache portion on its own is sketched just after this list):

  • WiredTiger compresses data on disk, but the data in memory is uncompressed.

  • WiredTiger does not fsync data to disk on every commit by default, so the log files are also buffered in RAM, which takes its toll on memory. It's also mentioned that in order to use I/O efficiently, WiredTiger batches I/O requests (cache misses) together, which also seems to take some RAM (in fact, dirty pages (pages that have been changed/updated) keep a list of their updates stored in a concurrent SkipList).

  • WiredTiger keeps multiple versions of records in its cache (Multi-Version Concurrency Control; read operations access the last version committed before their operation).

  • WiredTiger keeps checksums of the data in its cache.

  • MongoDB itself consumes memory to handle open connections, aggregations, server-side code, etc.
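
As a side note, the WiredTiger cache portion itself can be inspected from the mongo shell via serverStatus; a minimal sketch (field names as exposed by WiredTiger's statistics, values in bytes):

// How much of mongod's memory is the WiredTiger cache itself?
var cache = db.serverStatus().wiredTiger.cache;
var gb = 1024 * 1024 * 1024;
print("configured max : " + (cache["maximum bytes configured"] / gb).toFixed(1) + " GB");
print("currently used : " + (cache["bytes currently in the cache"] / gb).toFixed(1) + " GB");
print("dirty          : " + (cache["tracked dirty bytes in the cache"] / gb).toFixed(1) + " GB");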

Considering these facts, relying on show dbs was not technically correct, since it only shows the compressed size of the databases.

The following commands can be used to get the full dataset size.

db.getSiblingDB('data_server').stats()
// OR
db.stats()

This gives the following result:

{
    "db" : "data_server",
    "collections" : 11,
    "objects" : 266565289,
    "avgObjSize" : 224.8413545621088,
    "dataSize" : 59934900658, # 60GBs
    "storageSize" : 22959984640,
    "numExtents" : 0,
    "indexes" : 41,
    "indexSize" : 7757348864, # 7.7GBs
    "ok" : 1
}

So it seems that the actual (uncompressed) dataset size plus its indexes take about 68 GB of that memory.
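
As a convenience, stats() also accepts a scale factor, so the same numbers can be printed in GB directly:

// Same stats, scaled to GB instead of bytes
db.getSiblingDB('data_server').stats(1024 * 1024 * 1024)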

Considering all this, I guess the memory usage is now pretty much expected; the good part is that it's completely okay to limit the WiredTiger cache size, since WiredTiger handles I/O operations pretty efficiently (as described above).
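
If you do decide to cap the cache, it can be set in mongod.conf (storage.wiredTiger.engineConfig.cacheSizeGB) or changed at runtime; a minimal mongo-shell sketch (the 40G value here is just an example, not a recommendation):

// Shrink the WiredTiger cache on a running mongod (the runtime equivalent of
// setting storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf)
db.adminCommand({
    setParameter: 1,
    wiredTigerEngineRuntimeConfig: "cache_size=40G"
})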

There also remains the problem of OOM. To work around it, since we didn't have enough resources to move MongoDB elsewhere, we lowered the oom_score_adj of the important processes to prevent the OOM killer from picking them, at least for the time being (meaning we told the OOM killer not to kill our desired processes).
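
For reference, a rough sketch of what that looks like (the process name is just an example; -1000 fully exempts a process from the OOM killer, while smaller negative values only make it a less likely target):

# Make a running postgres (or any other critical process) an unlikely/never
# target for the OOM killer. Values range from -1000 (never kill) to 1000.
echo -1000 | sudo tee /proc/"$(pgrep -xo postgres)"/oom_score_adj

# For services managed by systemd, the same thing can be made persistent
# with OOMScoreAdjust= in the unit file.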