MongoDB – Optimizing queries that filter on unindexed fields in a large MongoDB collection

mongodb, optimization, performance, query-performance

We have a large MongoDB collection, currently 6 TB and growing quickly. The collection stores user and automated feedback, and as such is used for all sorts of analytics. Each document has about 20 fields. Because the collection is used for analytics, queries can filter on any combination of fields, so creating an index for every combination would be impractical.

Is there a good way to index the collection for such (random-like) queries? If not, is there a way to optimize the queries themselves?

If it helps with optimization: the queries are paged.
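A paged query here might look like the following (a minimal sketch in the mongo shell; the sort key and page size are assumptions, not taken from our actual workload):

db.usage.find({"appVersion": "4.0.0.0"}).sort({_id: 1}).skip(40).limit(20)  // page 3, 20 results per page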

Example query:

db.usage.find({"appVersion": "4.0.0.0", "expression": "abcd"})

where appVersion is not indexed.
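You can confirm that such a query falls back to a full collection scan with explain() (a sketch using the example query above):

db.usage.find({"appVersion": "4.0.0.0", "expression": "abcd"}).explain("executionStats")
// with no usable index, the winningPlan reports a COLLSCAN stage
// and totalDocsExamined approaches the full collection size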

Best Answer

Analyze which queries are most common and create indexes for the "top ten" of them, including only the first two "columns" (keys). Remember to create only one index per key pair (if you have keys A and B, create only the index A,B and not B,A). Also, let each key be the "first" key in at most one index (so if you already have the A,B index and need A,C, create it as C,A instead).
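As a sketch, if A and B stand for appVersion and expression from the example query (field names taken from the question, not from a real workload analysis), the first step would produce:

db.usage.createIndex({ appVersion: 1, expression: 1 })
// serves queries filtering on appVersion alone, and on appVersion + expression

This works because a compound index also serves any query on a prefix of its keys, which is why each key only needs to appear as the "first" key in one index.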

After that, create single-key indexes for the keys that did not end up as a leading key in the first step (so a single-key index for key B, because B is not the "first" key in either of the indexes A,B or C,A).
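Continuing the sketch with the example fields, the second step would add something like:

db.usage.createIndex({ expression: 1 })
// expression is never the leading key of a compound index above,
// so it gets its own single-key index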

You can use Dex to analyze your query log or profiler output for "missing indexes".