I have a Kubernetes cluster on bare metal with 3 machines, each with 8 GB of RAM. All my apps and a MongoDB 4.0.9 replica set run on it.
There is an import program that:
- Downloads an 8 GB mongo dump from an external source and restores it into a freshly created database A. There is one collection without indexes.
- Fully browses the restored collection with one find({}) and 1M next() calls.
- Emits AMQP messages for each doc; the messages are then stored in mongo databases B, C, D.
- Drops database A.
Step 2 uses a lot of RAM.
How should I configure mongo database A (while leaving database B as it is) to minimize the memory footprint of the import operation? (More important tasks are running in the cluster.)
For example, can I configure mongo not to cache the temporary collection?
Best Answer
Starting in MongoDB 3.4, the default WiredTiger internal cache size is the larger of either:
- 50% of (RAM - 1 GB), or
- 256 MB.
For example, in your case, with 8 GB of RAM, the WiredTiger cache will use 3.5 GB (0.5 * (8 GB - 1 GB) = 3.5 GB).
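The arithmetic can be checked with a small sketch. The function name defaultCacheGB is hypothetical (it is not a MongoDB API); it only restates the documented rule of 50% of (RAM - 1 GB) with a 256 MB floor.

```javascript
// Hypothetical helper, not part of MongoDB: restates the documented
// default WiredTiger cache rule for MongoDB 3.4 and later.
function defaultCacheGB(ramGB) {
  var halfOfRest = 0.5 * (ramGB - 1); // 50% of (RAM - 1 GB)
  var floorGB = 0.25;                 // 256 MB lower bound
  return Math.max(halfOfRest, floorGB);
}

// For an 8 GB machine this evaluates to 3.5 (GB).
var cacheFor8GB = defaultCacheGB(8);
```

On a very small machine the 256 MB floor applies instead: defaultCacheGB(1) evaluates to 0.25.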
This can be reduced, for example to 1 GB, with an admin command that sets the wiredTigerEngineRuntimeConfig parameter (cache_size=1G) only while the mongorestore operation runs; revert it to 3.5 GB once the restoration is completed.
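The admin command referred to above is presumably the setParameter call with wiredTigerEngineRuntimeConfig. A minimal sketch for the mongo shell follows; the helper cacheSizeCommand and the variable names are hypothetical, and the sizes are given in megabytes on the assumption that cache_size takes a whole number with a unit suffix (so 3.5 GB is written as 3584M).

```javascript
// Hypothetical helper: builds the setParameter command document
// for a given WiredTiger cache size in megabytes.
function cacheSizeCommand(sizeMB) {
  return {
    setParameter: 1,
    wiredTigerEngineRuntimeConfig: "cache_size=" + sizeMB + "M"
  };
}

var shrinkForRestore = cacheSizeCommand(1024); // 1 GB during the import
var restoreDefault = cacheSizeCommand(3584);   // back to 3.5 GB afterwards

// `db` exists only inside the mongo shell; the guard lets the file
// load elsewhere without error.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(shrinkForRestore));
  // ... run mongorestore and the import, then revert:
  // printjson(db.adminCommand(restoreDefault));
}
```

The setting is applied to the mongod you are connected to, so on a replica set it would need to be run against each member, and it is not kept across a restart. The WiredTiger cache is shared by every database on that mongod, so this limits the cache for B, C and D as well as A for as long as it is in effect.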
Note:
Reducing the WiredTiger cache size will increase the time needed to restore the data.