MongoDB – Why Covered Count Query Still Fetches and Examines Documents

mongodbmongodb-3.2

I don't understand why a count query fetches documents into memory even though an index exists which should cover the query. The problem is that such a query can take more than one hour in production. It slows down the whole database because lot of data needs to be read from disk (75-100 MB/sec while the query is running). The used index in production is about 5 GB on each mongodb node. The total size of all indexes in production is 32 GB and fits perfectly in RAM since each node has 128 GB RAM.

I broke down the problem to a minimalistic, unsharded setup. I inserted the following types of documents:

  1. 3 documents without both fields shopId and missingSince
  2. 5 documents with field shopId:1 without field missingSince
  3. 7 documents with field shopId:1 and missingSince:null
  4. 13 documents with field shopId:1 and
    missingSince:ISODate("2017-05-22T07:52:40.831Z")

I created the non-sparse index {shopId:1, missingSince:1}.
The execution plan of the query count({"shopId":1, "missingSince":null}) indicated "totalDocsExamined" : 12 which means that 12 documents had to be fetched. These must be the 5 documents of point 2 plus the 7 documents of point 3. All these 12 documents should be in the index with shopId:1, missingSince:null, thus satisfying the query.

But why does mongodb still need to fetch and examine these 12 documents?

Here is my test collection:

rs1:PRIMARY> db.offer.find()
{ "_id" : 1, "v" : 1 }
{ "_id" : 2, "v" : 1 }
{ "_id" : 3, "v" : 1 }
{ "_id" : 4, "shopId" : 1, "v" : 1 }
{ "_id" : 5, "shopId" : 1, "v" : 1 }
{ "_id" : 6, "shopId" : 1, "v" : 1 }
{ "_id" : 7, "shopId" : 1, "v" : 1 }
{ "_id" : 8, "shopId" : 1, "v" : 1 }
{ "_id" : 9, "shopId" : 1, "missingSince" : null, "v" : 1 }
{ "_id" : 10, "shopId" : 1, "missingSince" : null, "v" : 1 }
{ "_id" : 11, "shopId" : 1, "missingSince" : null, "v" : 1 }
{ "_id" : 12, "shopId" : 1, "missingSince" : null, "v" : 1 }
{ "_id" : 13, "shopId" : 1, "missingSince" : null, "v" : 1 }
{ "_id" : 14, "shopId" : 1, "missingSince" : null, "v" : 1 }
{ "_id" : 15, "shopId" : 1, "missingSince" : null, "v" : 1 }
{ "_id" : 16, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 17, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 18, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 19, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 20, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 21, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 22, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 23, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 24, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 25, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 26, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 27, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }
{ "_id" : 28, "shopId" : 1, "missingSince" : ISODate("2017-05-22T07:52:40.831Z"), "v" : 1 }

Here is the output of explain():

rs1:PRIMARY> db.offer.explain(true).count({"shopId":1, "missingSince":null})
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.offer",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "missingSince" : {
                        "$eq" : null
                    }
                },
                {
                    "shopId" : {
                        "$eq" : 1
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "COUNT",
            "inputStage" : {
                "stage" : "FETCH",
                "filter" : {
                    "missingSince" : {
                        "$eq" : null
                    }
                },
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "keyPattern" : {
                        "shopId" : 1,
                        "missingSince" : 1
                    },
                    "indexName" : "shopId_1_missingSince_1",
                    "isMultiKey" : false,
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 1,
                    "direction" : "forward",
                    "indexBounds" : {
                        "shopId" : [
                            "[1.0, 1.0]"
                        ],
                        "missingSince" : [
                            "[null, null]"
                        ]
                    }
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 0,
        "executionTimeMillis" : 0,
        "totalKeysExamined" : 12,
        "totalDocsExamined" : 12,
        "executionStages" : {
            "stage" : "COUNT",
            "nReturned" : 0,
            "executionTimeMillisEstimate" : 0,
            "works" : 13,
            "advanced" : 0,
            "needTime" : 12,
            "needYield" : 0,
            "saveState" : 0,
            "restoreState" : 0,
            "isEOF" : 1,
            "invalidates" : 0,
            "nCounted" : 12,
            "nSkipped" : 0,
            "inputStage" : {
                "stage" : "FETCH",
                "filter" : {
                    "missingSince" : {
                        "$eq" : null
                    }
                },
                "nReturned" : 12,
                "executionTimeMillisEstimate" : 0,
                "works" : 13,
                "advanced" : 12,
                "needTime" : 0,
                "needYield" : 0,
                "saveState" : 0,
                "restoreState" : 0,
                "isEOF" : 1,
                "invalidates" : 0,
                "docsExamined" : 12,
                "alreadyHasObj" : 0,
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "nReturned" : 12,
                    "executionTimeMillisEstimate" : 0,
                    "works" : 13,
                    "advanced" : 12,
                    "needTime" : 0,
                    "needYield" : 0,
                    "saveState" : 0,
                    "restoreState" : 0,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "keyPattern" : {
                        "shopId" : 1,
                        "missingSince" : 1
                    },
                    "indexName" : "shopId_1_missingSince_1",
                    "isMultiKey" : false,
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 1,
                    "direction" : "forward",
                    "indexBounds" : {
                        "shopId" : [
                            "[1.0, 1.0]"
                        ],
                        "missingSince" : [
                            "[null, null]"
                        ]
                    },
                    "keysExamined" : 12,
                    "dupsTested" : 0,
                    "dupsDropped" : 0,
                    "seenInvalidated" : 0
                }
            }
        },
        "allPlansExecution" : [ ]
    },
    "serverInfo" : {
        "host" : "Kays MacBook Pro",
        "port" : 27017,
        "version" : "3.2.6",
        "gitVersion" : "05552b562c7a0b3143a729aaa0838e558dc49b25"
    },
    "ok" : 1
}

Best Answer

Since nobody could find a valid reason for this issue, I opened yesterday a mongodb bug report: https://jira.mongodb.org/browse/SERVER-29326

Mongodb engineers confirmed that's a bug. Unfortunately it's not mentioned in mongodb's documentation which would have saved us many hours tracing down the issue and deploy another schema design from the beginning.