Mongodb – Speeding up MongoBD query { $ne: [] }

mongodb

I'm currently doing a MongoDB aggregation but then have a query looking for all documents where a certain field, whose value is an array, is empty.

obs = db.collection.count({'things.titles': {'$ne': [] }})

To ensure this query uses an index, do I just need to do something like this?:

db.collection.ensureIndex({'things.titles': 1})

I understand this creates a multikey index, but it still takes a very long time (more than an hour) on a collection sized at 4739208 documents.

Best Answer

Your question specifies:

... a query looking for all documents where a certain field, whose value is an array, is empty.

Your query is count all documents where the array is not empty, is this intended?

Depending upon the cardinality of the values of this field, this could be causing these values to be expensive.

To query for empty arrays, the operation would be:

 db.collection.explain().count( 'things.titles' : [ ] } )

Note that there are edge cases to this query. For example, the following would be returned:

  { "_id" : ..., "things.titles" : [ [ ] ] }

The index usage of the count operation can be determined using the collection.explain() function.

For example:

 db.collection.explain().count({'things.titles': {'$ne': [] }})

The output will provide information showing the index bounds being used (if they exist).