MongoDB geospatial query with sort – performance issues

mongodbsortingspatial

I have query (which is very slow ~2,5s):

db.markers.find({ latlng: { '$within': { '$box': [ [ -16, -140 ], [ 75, 140 ] ] } } }).sort({_id: -1}).limit(1000)

When I run explain for this query I get

{
   "cursor" : "GeoBrowse-box",
   "isMultiKey" : false,
   "n" : 1000,
   "nscannedObjects" : 242331,
   "nscanned" : 242331,
   "nscannedObjectsAllPlans" : 242331,
   "nscannedAllPlans" : 242331,
   "scanAndOrder" : true,
   "indexOnly" : false,
   "nYields" : 1383,
    "nChunkSkips" : 0,
    "millis" : 2351,
    "indexBounds" : {
        "latlng" : [ ]
    },
    "lookedAt" : NumberLong(262221),
    "matchesPerfd" : NumberLong(242331),
    "objectsLoaded" : NumberLong(242331),
    "pointsLoaded" : NumberLong(0),
    "pointsSavedForYield" : NumberLong(0),
    "pointsChangedOnYield" : NumberLong(0),
    "pointsRemovedOnYield" : NumberLong(0),
    "server" : "xx:27017"
}

When I remove sort({_id: -1}) explain gives me (fast query 5 millis):

{
    "cursor" : "GeoBrowse-box",
    "isMultiKey" : false,
    "n" : 1000,
    "nscannedObjects" : 1000,
    "nscanned" : 1000,
    "nscannedObjectsAllPlans" : 1000,
    "nscannedAllPlans" : 1000,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 5,
    "indexBounds" : {
        "latlng" : [ ]
    },
    "lookedAt" : NumberLong(1000),
    "matchesPerfd" : NumberLong(1000),
    "objectsLoaded" : NumberLong(1000),
    "pointsLoaded" : NumberLong(0),
    "pointsSavedForYield" : NumberLong(0),
    "pointsChangedOnYield" : NumberLong(0),
    "pointsRemovedOnYield" : NumberLong(0),
        "server" : "xx:27017"
}

I have 2d index on latlng, desc index on _id and compound indexes.

db.markers.ensureIndex({latlng: '2d', _id:-1})
db.markers.ensureIndex({ latlng: '2d' })
db.markers.ensureIndex({ _id: -1 })

What I want to achieve is to get markers from a particular area sorted from newest.

Any ideas or suggestions how to do that in much less time than 2.5 seconds?

Best Answer

A query in MongoDB can only use one index at a time, so it's a case of one or the other - it can't use the 2d index first, then do a sort on the _id index. In order to use indexes for both the selection and the sort, you would need a compound index like this:

db.markers.ensureIndex( { latlng : "2d" , _id : 1 } );

Try that, or similar and see how it impacts the results bearing in mind that once you define it, you can remove the original 2d index to save space and that this new index will have to be loaded into memory to be efficient.

Update: as mentioned in the summary, the above did not improve things, and the resulting query results in a scanAndOrder result. This also happens with range based queries, as explained in this excellent blog post:

http://blog.mongolab.com/2012/06/cardinal-ins/

As explained in that post, the usual resolution for range based query performance is to switch the order of the indexes. However this is currently not possible with geo indexes. There is a Jira issue already open for this here for voting and tracking purposes:

https://jira.mongodb.org/browse/SERVER-4247