MongoDB high memory usage with WiredTiger

memory mongodb wiredtiger

We are struggling with high MongoDB RAM usage on an AWS EC2 instance running Ubuntu with 30 GB of RAM. The server hosts over 3,000 databases, and it runs out of memory from time to time. I have read a lot about MongoDB memory usage, and most sources recommend reducing the WiredTiger cache size, but I am not sure that would help. The output below (mongostat, the tcmalloc heap summary, and the WiredTiger cache statistics) may be useful.

insert query update delete getmore command dirty  used flushes vsize   res qrw arw net_in net_out conn set repl                time
    *0    *0     *3     *0       0    39|0  0.4% 79.3%       0 13.1G 10.3G 0|0 1|0  6.26k   52.0k  251 rs0  SEC Nov 10 05:13:34.022
    *0    *0     *1     *0       0    58|0  0.4% 79.3%       0 13.1G 10.3G 0|0 1|0  10.4k   56.9k  250 rs0  SEC Nov 10 05:13:37.020
    *0    *0     *2     *0       0   120|0  0.4% 79.3%       0 13.1G 10.3G 0|0 2|0  22.6k   83.7k  250 rs0  SEC Nov 10 05:13:40.027
------------------------------------------------
MALLOC:    17644185184 (16826.8 MiB) Bytes in use by application
MALLOC: +    504643584 (  481.3 MiB) Bytes in page heap freelist
MALLOC: +    977738416 (  932.4 MiB) Bytes in central cache freelist
MALLOC: +      1190016 (    1.1 MiB) Bytes in transfer cache freelist
MALLOC: +    312620656 (  298.1 MiB) Bytes in thread cache freelists
MALLOC: +    137748736 (  131.4 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  19578126592 (18671.2 MiB) Actual memory used (physical + swap)
MALLOC: +    488529920 (  465.9 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  20066656512 (19137.1 MiB) Virtual address space used
MALLOC:
MALLOC:        1730087              Spans in use
MALLOC:           3006              Thread heaps in use
MALLOC:           4096              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
{
    "application threads page read from disk to cache count" : 1308217,
    "application threads page read from disk to cache time (usecs)" : 91179623,
    "application threads page write from cache to disk count" : 95640493,
    "application threads page write from cache to disk time (usecs)" : 1699078290,
    "bytes belonging to page images in the cache" : 5025574875,
    "bytes belonging to the cache overflow table in the cache" : 182,
    "bytes currently in the cache" : 12703066940,
    "bytes dirty in the cache cumulative" : NumberLong("4256396040375"),
    "bytes not belonging to page images in the cache" : 7677492064,
    "bytes read into cache" : 28764089566,
    "bytes written from cache" : NumberLong("2070527030210"),
    "cache overflow cursor application thread wait time (usecs)" : 0,
    "cache overflow cursor internal thread wait time (usecs)" : 0,
    "cache overflow score" : 0,
    "cache overflow table entries" : 0,
    "cache overflow table insert calls" : 0,
    "cache overflow table max on-disk size" : 0,
    "cache overflow table on-disk size" : 0,
    "cache overflow table remove calls" : 0,
    "checkpoint blocked page eviction" : 0,
    "eviction calls to get a page" : 3086214,
    "eviction calls to get a page found queue empty" : 1983708,
    "eviction calls to get a page found queue empty after locking" : 7223,
    "eviction currently operating in aggressive mode" : 0,
    "eviction empty score" : 0,
    "eviction passes of a file" : 1255755,
    "eviction server candidate queue empty when topping up" : 11915,
    "eviction server candidate queue not empty when topping up" : 2358,
    "eviction server evicting pages" : 0,
    "eviction server slept, because we did not make progress with eviction" : 884827,
    "eviction server unable to reach eviction goal" : 0,
    "eviction server waiting for a leaf page" : 350079123,
    "eviction server waiting for an internal page sleep (usec)" : 0,
    "eviction server waiting for an internal page yields" : 0,
    "eviction state" : 128,
    "eviction walk target pages histogram - 0-9" : 1104474,
    "eviction walk target pages histogram - 10-31" : 150582,
    "eviction walk target pages histogram - 128 and higher" : 0,
    "eviction walk target pages histogram - 32-63" : 4,
    "eviction walk target pages histogram - 64-128" : 695,
    "eviction walk target strategy both clean and dirty pages" : 0,
    "eviction walk target strategy only clean pages" : 1255755,
    "eviction walk target strategy only dirty pages" : 0,
    "eviction walks abandoned" : 102722,
    "eviction walks gave up because they restarted their walk twice" : 1129365,
    "eviction walks gave up because they saw too many pages and found no candidates" : 2637,
    "eviction walks gave up because they saw too many pages and found too few candidates" : 165,
    "eviction walks reached end of tree" : 2359112,
    "eviction walks started from root of tree" : 1249356,
    "eviction walks started from saved location in tree" : 6399,
    "eviction worker thread active" : 4,
    "eviction worker thread created" : 0,
    "eviction worker thread evicting pages" : 1098213,
    "eviction worker thread removed" : 0,
    "eviction worker thread stable number" : 0,
    "files with active eviction walks" : 0,
    "files with new eviction walks started" : 1229747,
    "force re-tuning of eviction workers once in a while" : 0,
    "forced eviction - pages evicted that were clean count" : 2715,
    "forced eviction - pages evicted that were clean time (usecs)" : 5182,
    "forced eviction - pages evicted that were dirty count" : 1919,
    "forced eviction - pages evicted that were dirty time (usecs)" : 7555109,
    "forced eviction - pages selected because of too many deleted items count" : 3202,
    "forced eviction - pages selected count" : 6776,
    "forced eviction - pages selected unable to be evicted count" : 1031,
    "forced eviction - pages selected unable to be evicted time" : 65,
    "hazard pointer blocked page eviction" : 623,
    "hazard pointer check calls" : 1105566,
    "hazard pointer check entries walked" : 28919669,
    "hazard pointer maximum array length" : 2,
    "in-memory page passed criteria to be split" : 2230,
    "in-memory page splits" : 1111,
    "internal pages evicted" : 5255,
    "internal pages queued for eviction" : 1818,
    "internal pages seen by eviction walk" : 39433,
    "internal pages seen by eviction walk that are already queued" : 1642,
    "internal pages split during eviction" : 5,
    "leaf pages split during eviction" : 2742,
    "maximum bytes configured" : 15881732096,
    "maximum page size at eviction" : 76162,
    "modified pages evicted" : 103981,
    "modified pages evicted by application threads" : 0,
    "operations timed out waiting for space in cache" : 0,
    "overflow pages read into cache" : 0,
    "page split during eviction deepened the tree" : 0,
    "page written requiring cache overflow records" : 0,
    "pages currently held in the cache" : 242898,
    "pages evicted by application threads" : 0,
    "pages queued for eviction" : 1213436,
    "pages queued for eviction post lru sorting" : 1215321,
    "pages queued for urgent eviction" : 2479,
    "pages queued for urgent eviction during walk" : 0,
    "pages read into cache" : 1376530,
    "pages read into cache after truncate" : 227271,
    "pages read into cache after truncate in prepare state" : 0,
    "pages read into cache requiring cache overflow entries" : 0,
    "pages read into cache requiring cache overflow for checkpoint" : 0,
    "pages read into cache skipping older cache overflow entries" : 0,
    "pages read into cache with skipped cache overflow entries needed later" : 0,
    "pages read into cache with skipped cache overflow entries needed later by checkpoint" : 0,
    "pages requested from the cache" : 60531186418,
    "pages seen by eviction walk" : 3226999,
    "pages seen by eviction walk that are already queued" : 828715,
    "pages selected for eviction unable to be evicted" : 1870,
    "pages selected for eviction unable to be evicted as the parent page has overflow items" : 0,
    "pages selected for eviction unable to be evicted because of active children on an internal page" : 1246,
    "pages selected for eviction unable to be evicted because of failure in reconciliation" : 0,
    "pages selected for eviction unable to be evicted due to newer modifications on a clean page" : 0,
    "pages walked for eviction" : 23891436,
    "pages written from cache" : 95646477,
    "pages written requiring in-memory restoration" : 16,
    "percentage overhead" : 8,
    "tracked bytes belonging to internal pages in the cache" : 30689239,
    "tracked bytes belonging to leaf pages in the cache" : 12672377701,
    "tracked dirty bytes in the cache" : 158416275,
    "tracked dirty pages in the cache" : 6130,
    "unmodified pages evicted" : 998576
}
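
To make the figures above easier to compare, here is a rough Python sketch that derives a few ratios from them. It assumes the WiredTiger cache is allocated from the same tcmalloc heap that the MALLOC summary describes, and that the outputs were sampled close enough in time to be comparable (the mongostat `res` column suggests they may not have been). The constant names and the `summarize` function are made up for illustration.

```python
# Figures copied from the tcmalloc and WiredTiger cache output above.
GIB = 1024 ** 3

CACHE_MAX_BYTES = 15881732096     # "maximum bytes configured"
CACHE_USED_BYTES = 12703066940    # "bytes currently in the cache"
CACHE_DIRTY_BYTES = 158416275     # "tracked dirty bytes in the cache"
HEAP_IN_USE_BYTES = 17644185184   # tcmalloc "Bytes in use by application"
HEAP_TOTAL_BYTES = 19578126592    # tcmalloc "Actual memory used (physical + swap)"


def summarize():
    """Return a few derived ratios; a hypothetical helper, not a MongoDB API."""
    return {
        # Configured cache ceiling in GiB.
        "cache_limit_gib": CACHE_MAX_BYTES / GIB,
        # How full the cache is relative to its ceiling.
        "cache_fill_pct": 100 * CACHE_USED_BYTES / CACHE_MAX_BYTES,
        # Share of the ceiling occupied by dirty data.
        "cache_dirty_pct": 100 * CACHE_DIRTY_BYTES / CACHE_MAX_BYTES,
        # Heap the application holds beyond the cache contents
        # (assumption: the cache lives inside this heap).
        "heap_outside_cache_gib": (HEAP_IN_USE_BYTES - CACHE_USED_BYTES) / GIB,
        # Freelists and metadata held by the allocator itself.
        "allocator_overhead_gib": (HEAP_TOTAL_BYTES - HEAP_IN_USE_BYTES) / GIB,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, the cache sits at roughly 80% of a limit of about 14.8 GiB, while several more GiB of heap are held outside the cache, which is consistent with total process memory exceeding the cache limit.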

Best Answer

For anyone who has the same issue: I reduced memory usage significantly by setting the WiredTiger cache size to 8 GB on the 30 GB RAM server and 4 GB on the 16 GB RAM servers. Our app now runs smoothly.
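
For context, MongoDB (3.4 and later) documents the default WiredTiger internal cache size as the larger of 50% of (RAM - 1 GB) or 256 MB. The sketch below only reproduces that arithmetic so the chosen values can be compared with the defaults; the function name is made up for illustration. The limit itself is set with `storage.wiredTiger.engineConfig.cacheSizeGB` in the mongod configuration file, or with the `--wiredTigerCacheSizeGB` command-line option.

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    """Documented default: the larger of 50% of (RAM - 1 GB) or 256 MB.

    Hypothetical helper for illustration; it does not query a running server.
    """
    return max(0.5 * (ram_gb - 1.0), 0.25)


if __name__ == "__main__":
    # Cache sizes chosen in this answer, keyed by server RAM in GB.
    chosen = {30: 8, 16: 4}
    for ram_gb, cache_gb in chosen.items():
        default_gb = default_wiredtiger_cache_gb(ram_gb)
        share = 100 * cache_gb / ram_gb
        print(
            f"RAM {ram_gb} GB: default {default_gb:.1f} GB, "
            f"chosen {cache_gb} GB ({share:.0f}% of RAM)"
        )
```

The chosen values are roughly half the defaults, which leaves more headroom for memory that mongod uses outside the cache (connections, per-collection and per-index overhead, allocator freelists). Whether the same ratio suits another workload is something to verify by measurement.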