MongoDB crashes after multiple query attempts

mongodb

Every once in a while our mongo service crashes after logging multiple messages like this:

Sun Apr  6 01:11:38.648 [conn2149] getmore prod_db.tweets query: { query: {}, $snapshot: true } cursorid:2408233643061247141 ntoreturn:0 exhaust:1 keyUpdates:0 numYields: 25 locks(micros) r:120718 nreturned:901 reslen:4197371 122ms
Sun Apr  6 01:11:38.769 [conn2149] getmore prod_db.tweets query: { query: {}, $snapshot: true } cursorid:2408233643061247141 ntoreturn:0 exhaust:1 keyUpdates:0 numYields: 22 locks(micros) r:73717 nreturned:905 reslen:4196587 1

If I understand correctly, "numYields" means that it has tried "numYields" times to run the query, but yielded. However, I don't know which process might be blocking it, nor why it is crashing. Any ideas?

Best Answer

That query is simply a long-running read without any criteria (so it is running against all data). As it fetches the data, the results come back in batches (based on your batch size), and the client then issues a getmore on the same cursor for the next set of results.
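To make the batch-then-getmore pattern concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how MongoDB or any driver implements cursors; `ToyCursor`, `batch_size`, and `getmore_calls` are names I made up for illustration.

```python
from typing import Iterator, List


class ToyCursor:
    """Toy stand-in for a server-side cursor (hypothetical, not MongoDB code)."""

    def __init__(self, documents: List[dict], batch_size: int) -> None:
        self._documents = documents
        self._batch_size = batch_size
        self._position = 0
        self.getmore_calls = 0  # follow-up batch requests issued so far

    def _next_batch(self) -> List[dict]:
        # Slice out the next batch and advance the cursor position.
        end = self._position + self._batch_size
        batch = self._documents[self._position:end]
        self._position += len(batch)
        return batch

    def __iter__(self) -> Iterator[dict]:
        # The initial query returns the first batch.
        batch = self._next_batch()
        while batch:
            yield from batch
            if self._position >= len(self._documents):
                break  # cursor exhausted, nothing more to fetch
            # More data remains: request the next batch (a "getmore").
            self.getmore_calls += 1
            batch = self._next_batch()


cursor = ToyCursor([{"_id": i} for i in range(10)], batch_size=4)
results = list(cursor)
print(len(results), cursor.getmore_calls)
```

With 10 documents and a batch size of 4, the first batch comes from the initial query and the remaining two batches each need one getmore, which is the repeating pattern you see in your log.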

The numYields count does not mean the query is being blocked; it means the query yielded its lock when needed. A yield is usually done to let a write proceed, and usually when the query had to page fault to disk to get data; the query then resumes. When querying all data in a collection, this is going to happen often unless all your data plus indexes fit in RAM.

Therefore, the query is not being blocked. In fact, the getmore operations show that it is progressing over time. Most long-running reads will have a similar profile, especially if you are writing to the database at the same time.
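If you want to check that progress yourself, a small script can pull the interesting counters out of each getmore line. This is a rough sketch that assumes the log format shown in the question; the regular expression and the `parse_getmore` helper are my own and will need adjusting for other MongoDB versions or log formats.

```python
import re

# Matches the getmore log lines quoted in the question (assumed format).
GETMORE_RE = re.compile(
    r"\[(?P<conn>conn\d+)\] getmore (?P<ns>\S+) .*?"
    r"numYields: (?P<num_yields>\d+) .*?"
    r"nreturned:(?P<nreturned>\d+) reslen:(?P<reslen>\d+) (?P<millis>\d+)ms"
)


def parse_getmore(line):
    """Return the counters from one getmore log line, or None if no match."""
    match = GETMORE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric counters from strings to integers.
    for key in ("num_yields", "nreturned", "reslen", "millis"):
        fields[key] = int(fields[key])
    return fields


# First log line from the question, with the wrapped line joined back up.
sample = (
    "Sun Apr  6 01:11:38.648 [conn2149] getmore prod_db.tweets "
    "query: { query: {}, $snapshot: true } cursorid:2408233643061247141 "
    "ntoreturn:0 exhaust:1 keyUpdates:0 numYields: 25 "
    "locks(micros) r:120718 nreturned:901 reslen:4197371 122ms"
)
info = parse_getmore(sample)
print(info)
```

A steady stream of lines with the same cursorid, each returning roughly 900 documents in about 100ms, is what a healthy full-collection scan looks like rather than a stuck one.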

It is also unlikely that this query is the cause of any crash (it's just a read). More likely something else is causing the crash, and you are associating this query with it because it happens to be running when the crash occurs. (People often suspect the serverStatus command for the same reason: it is run once a minute by MMS.) I would recommend posting the full log messages around the crash as a separate question for proper diagnosis.

For what it's worth, given that snapshot is set to true and that it is reading all data, I suspect this is a mongodump query (mongodump defaults to using snapshot to avoid dumping duplicates when data is moved).