MongoDB – Is MongoDB index creation single-threaded?

index-tuning mongodb mongodb-4.0

I have a rather big dataset: 2 TB of text data, 18,000 million (18 billion) lines.

Each document has the same shape, with 5 fields: 4 strings + 1 integer.

All fields should be indexed and searchable.

That is: an array of 3 string tokens, 1 separate string, and 1 integer.

"t" : ["param","pam","tadam"],
"p" : "haha",
"f" : 3062632

I tried two ways:

  1. to import, then create index
  2. to create index, then import

The MongoDB journal is disabled; the setup is XFS, CentOS 7.6, MongoDB 4.0.6.
I tried a very beefy server (72 cores, 144 GB RAM, 6x SSD RAID).

  1. Speed was above a million inserts per second. After the insertion, which took less than a day, I started foreground indexing. I set the parameter maxIndexBuildMemoryUsageMegabytes=10000.

Command is:

db.runCommand( {
  createIndexes: "records",
  indexes: [
    { key: { "f" : 1 }, name: "find" },
    { key: { "p" : 1 }, name: "pind" },
    { key: { "t" : 1 }, name: "tind" }
  ]
} )

The setting is applied, because mongo reports "using bulk method; build may temporarily use up to 3333 megabytes of RAM".

The speed is terrible, and hardware saturation is close to zero:
1 CPU core at 100%
SSD at 3%
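
As a sanity check on the log message above, the 3333 MB figure matches the 10000 MB limit divided across the three indexes in the command. Here is a minimal sketch of that arithmetic in plain JavaScript (Node.js). The function name is hypothetical, and the even per-index split is an inference from the log line rather than something confirmed here:

```javascript
// Assumption: the memory limit is split evenly across the indexes
// built by one createIndexes command (inferred from the log line).
function perIndexBudgetMB(totalMB, indexCount) {
  if (!Number.isInteger(indexCount) || indexCount <= 0) {
    throw new RangeError("indexCount must be a positive integer");
  }
  // Integer megabytes, rounded down.
  return Math.floor(totalMB / indexCount);
}

// 10000 MB limit, 3 indexes ("find", "pind", "tind")
console.log(perIndexBudgetMB(10000, 3)); // 3333
```

Under the same assumption, building the indexes one at a time would give each build the full limit instead of a third of it.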

  2. If I create the indexes prior to inserting, speed starts at 100k inserts per second, but quickly drops to a few thousand records per second at 200-300 million documents.
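
To put the two insert rates in perspective, here is a rough back-of-the-envelope estimate in plain JavaScript (Node.js). The 5,000 inserts per second value is only an assumed stand-in for "a few thousand", the helper name is hypothetical, and the estimate ignores the faster initial phase:

```javascript
const SECONDS_PER_DAY = 86400;

// Days needed to insert `docs` documents at a constant rate.
function daysToInsert(docs, docsPerSecond) {
  return docs / docsPerSecond / SECONDS_PER_DAY;
}

const totalDocs = 18e9; // 18 billion documents, from the question

// Import without indexes: about 1 million inserts per second.
console.log(daysToInsert(totalDocs, 1e6).toFixed(1));  // 0.2 (about 5 hours)

// Import with indexes in place: assumed 5,000 inserts per second.
console.log(daysToInsert(totalDocs, 5000).toFixed(1)); // 41.7
```

At the degraded rate, the full import would take on the order of weeks rather than hours.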

Can this problem be fixed with MongoDB?
Or another DBMS?

One caveat: I need the whole dataset on one server, no clusters.

Best Answer

There is already an open issue in the MongoDB JIRA for this.

As per the issue, MongoDB index creation is not multi-threaded.

Related JIRA issue titles:

use multiple cores for index sort-phase

Multi-threaded index creation

UPDATE

Try increasing the maxIndexBuildMemoryUsageMegabytes parameter value.

The default value is 500 MB.

What does it do?

Limits the amount of memory that the simultaneous foreground index builds on one collection may consume for the duration of the builds.

So increasing this limit may improve the performance of index creation. For example:

db.adminCommand( { setParameter: 1, maxIndexBuildMemoryUsageMegabytes: 70000 } )
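
To confirm that the new value took effect, the parameter can be read back with getParameter. The sketch below builds both command documents in plain JavaScript. The helper names are hypothetical, and the commented usage lines assume a mongo shell session where `db` is defined:

```javascript
// Hypothetical helpers that only build the command documents;
// they do not talk to the server themselves.
function setIndexBuildMemoryCmd(megabytes) {
  if (!Number.isInteger(megabytes) || megabytes <= 0) {
    throw new RangeError("megabytes must be a positive integer");
  }
  return { setParameter: 1, maxIndexBuildMemoryUsageMegabytes: megabytes };
}

function getIndexBuildMemoryCmd() {
  return { getParameter: 1, maxIndexBuildMemoryUsageMegabytes: 1 };
}

// Usage in the mongo shell:
//   db.adminCommand(getIndexBuildMemoryCmd())       // read the current limit
//   db.adminCommand(setIndexBuildMemoryCmd(70000))  // raise the limit
//   db.adminCommand(getIndexBuildMemoryCmd())       // confirm the change
```

A value set this way applies to the running mongod, so it needs to be in place before the createIndexes command is issued.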