MongoDB Primary Member of Replica Set High Memory Usage

configuration, memory, mongodb, mongodb-3.6

I currently have a sharded cluster up and running, and I have noticed that the memory usage of the primary member of each replica set is very high: around 8GB, even though I started each member with the following:

mongod --auth --bind_ip_all --shardsvr --replSet a --smallfiles --oplogSize 1024 --keyFile /file/somewhere/

I thought (possibly naively) that the oplogSize would limit the amount of memory used.

Any guidance on how to solve this, or highlighting the error of my ways, is much appreciated.

Best Answer

Introduction

The oplog has very little to do with memory consumption. It is a capped collection used as a sort of write-ahead log for operations to be replicated to the other replica set members.

In general, MongoDB uses up to around 85% (give or take) of memory unless told otherwise. This memory is used to keep the indices and the "working set" (copies of the most recently used documents) in RAM to ensure optimal performance. While it is technically possible to limit the RAM used by MongoDB, it is a Very Bad Idea™ to do so: you severely limit MongoDB's performance and make it basically impossible to detect when you need to scale out or up because of insufficient RAM.
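
Before limiting anything, it is worth checking what that memory is actually being used for. A minimal check, assuming WiredTiger, the default --shardsvr port 27018 and an admin user (adjust the connection parameters to your setup):

$ mongo --port 27018 -u yourAdminUser -p --authenticationDatabase admin --eval '
    var s = db.serverStatus();
    // resident memory of this mongod process, in MB
    print("resident (MB):         " + s.mem.resident);
    // how much of that is WiredTiger cache, and the configured ceiling
    print("WT cache used (bytes): " + s.wiredTiger.cache["bytes currently in the cache"]);
    print("WT cache max (bytes):  " + s.wiredTiger.cache["maximum bytes configured"]);
'

If most of the resident memory is WiredTiger cache, the instance is simply doing what it is designed to do: using otherwise idle RAM to serve requests faster.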

TL;DR: If you have to ask how to limit the RAM utilised by MongoDB, you probably should not limit it, as you are unable to judge the side effects this step will introduce.

Limiting MongoDB's memory consumption

You basically have three options: limit the cache size of the WiredTiger storage engine, use cgroups to limit the memory mongod can request from the OS, or use Docker to do so (which makes it a bit easier, but under the hood Docker uses cgroups as well, iirc).

Option 1: Limit WiredTiger's cache size

Add the following option to your configuration file (I assume it is in YAML format):

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: <number>

where <number> is the maximum amount of RAM MongoDB is allowed to use for WiredTiger's cache. Note that fiddling with this parameter can severely impact performance (on the other hand, limiting MongoDB's memory consumption always will). Please also note that this does not limit the memory used by mongod itself (for example, each connection gets a small stack assigned).
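
Since you start your mongod instances with command-line flags rather than a configuration file, the equivalent flag is --wiredTigerCacheSizeGB. A sketch based on your original invocation (the value 2 is purely illustrative, not a recommendation):

mongod --auth --bind_ip_all --shardsvr --replSet a --smallfiles --oplogSize 1024 \
       --keyFile /file/somewhere/ --wiredTigerCacheSizeGB 2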

Option 2: Using cgroups to limit the overall memory consumption of mongod

As a root user, first ensure that cgroups are enabled:

$ lscgroup
cpuset:/
cpu:/
cpuacct:/
memory:/
devices:/
freezer:/
net_cls:/
blkio:/

Assuming cgroups are available, you can now configure a control group for MongoDB's memory consumption in /etc/cgconfig.conf:

group mongodb {
    memory {
        memory.limit_in_bytes = 512m;
    }
}

After you have done so, you need to restart the cgconfig service. Do not simply copy and paste the config above: with 512 MB, MongoDB will barely run (if at all). Adjust the memory limit to your needs, allowing at least 2GB of RAM.
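
For reference, restarting the service and checking that the limit took effect might look like this (service names and tooling vary by distribution; cgget is part of the libcgroup tools):

# recreate the control groups from /etc/cgconfig.conf
$ sudo service cgconfig restart
# verify the memory limit of the new mongodb group
$ cgget -r memory.limit_in_bytes mongodb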

Next, you need to assign mongod to the control group you just created. To do so, you need to edit /etc/cgrules.conf:

*:mongod memory mongodb/

where * denotes that this rule applies regardless of who started mongod, and memory is limited according to the rules of the control group mongodb/. As a last step, you now need to restart the cgred and MongoDB services. mongod should now use only the specified amount of RAM, for better or worse.
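
To double-check that the running mongod actually landed in the control group, you can inspect its cgroup membership (a sketch; it assumes a single mongod on the host, and the hierarchy number in the output will differ from system to system):

$ sudo service cgred restart
$ sudo service mongod restart
# the memory line should point at the mongodb group
$ grep memory /proc/$(pidof mongod)/cgroup
4:memory:/mongodb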

Option 3: Use Docker to limit mongod's overall memory consumption:

  1. Identify which version of MongoDB you are running currently

    $ mongod -version
    db version v3.4.10
    git version: 078f28920cb24de0dd479b5ea6c66c644f6326e9
    OpenSSL version: OpenSSL 1.0.2n  7 Dec 2017
    allocator: system
    modules: none
    build environment:
       distarch: x86_64
       target_arch: x86_64
    

    Your output may be different, but we only need the db version, or more precisely its release series (major.minor). In this example, it is "3.4".

  2. Pull a suitable docker image

    $ docker pull mongo:3.4
    

    You should pull the docker image for the version you determined earlier and use the pulled image in the next step.

  3. Run the docker image with the appropriate parameters

     $ docker run -d --name mongod --memory=512m \
     > --mount type=bind,src=/path/to/your/datafiles,dst=/data/db \
     > --mount type=bind,src=/file/somewhere/,dst=/key.file,readonly \
     > mongo:3.4 mongod <yourOptions>
    

    A few things to note here: The first mount makes your existing datafiles accessible from inside the container, while the second mount does the same for your keyfile. You need to adjust your mongod options accordingly, namely the keyFile option to point to the destination you mounted your keyfile to. See the docker documentation and the README of the mongo docker image for details.
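
To make that concrete, a sketch of what <yourOptions> could look like given the mounts above; the only change compared to your original flags is that --keyFile now points at the in-container path:

--auth --bind_ip_all --shardsvr --replSet a --smallfiles --oplogSize 1024 --keyFile /key.file

Afterwards, a quick docker stats mongod will confirm that the container's memory limit matches what you passed to --memory.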

Conclusion

You have a sharded cluster and you want to limit the memory consumption of the individual shard members.

In case we are talking about a production system, this is a Very Bad Idea™: either you have other services running on the machines that run your mongods (which would make those services compete for resources under heavy load), or you artificially limit the performance MongoDB can provide (by using the methods described above). This is bad systems design. Why did you shard in the first place? Sharding is MongoDB's method of load balancing and scaling out (for when a limiting factor, say RAM, can no longer be scaled up because the bang you get for the buck is insufficient). I have a mantra which I repeat to customers (and, to be honest, occasionally to myself):

MongoDB instances bearing production data should run on dedicated machines. No exceptions!

Depending on the reasons you sharded in the first place and how many shards you have, it may well be that you didn't even need to shard if you ran your cluster on dedicated machines. Do the math.

And even if it were a good idea to have your cluster nodes running other services, they are obviously under-provisioned. Given the price of RAM compared with the cost of reduced performance, it is basically a no-brainer to scale your machines up with a decent amount of RAM rather than limiting it artificially to enforce a system design which is bad in the first place.

My advice is not to follow any of the above approaches. Instead, run your data-bearing MongoDB instances on dedicated machines. Scale them up as long as you get a corresponding bang for the buck, RAM- and IO-wise (CPU is rarely an issue), before you shard. As of the time of this writing, that would be between 128 and 256GB of RAM and a RAID 0 (in case you have a replica set, which you do have, don't you?) or RAID 10 (in case your shards are not replica sets - shame on you ;) ) with SSDs. Shard only if

  • you have too many IOPS for a single machine to handle
  • you need more RAM than you could fit into your replica set members with a good bang for the buck
  • you have more data than a single machine can persist.

hth

PS Do not blame it on me or MongoDB if your performance goes south after you limited the RAM for the mongods.