You are right in that your example query would not use that index.
The query planner will consider using an index if either:
- all the fields contained in it are referenced in the query, or
- some of the fields, starting from the first one, are referenced.
It cannot make use of an index whose first field is not used by the query.
So for your example:
SELECT [id], [name], [customerId], [dateCreated]
FROM Representatives WHERE customerId=1
ORDER BY dateCreated
it would consider indexes such as:
[customerId]
[customerId], [dateCreated]
[customerId], [dateCreated], [name]
but not:
[name], [customerId], [dateCreated]
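The leftmost-prefix rule above can be sketched as a small function. This is only a simplified JavaScript model, not how any real planner is implemented: the names `usablePrefixLength` and `isCandidate` are made up for illustration, and `queryFields` is assumed to hold only the fields the query filters or sorts on.

```javascript
// Simplified model of the leftmost-prefix rule (an illustration only;
// a real planner also weighs statistics and cost estimates).

// Counts how many leading fields of the index the query can use.
function usablePrefixLength(indexFields, queryFields) {
  const referenced = new Set(queryFields);
  let count = 0;
  for (const field of indexFields) {
    if (!referenced.has(field)) break; // stop at the first unused field
    count++;
  }
  return count;
}

// An index is a candidate only if its first field is used by the query.
function isCandidate(indexFields, queryFields) {
  return usablePrefixLength(indexFields, queryFields) > 0;
}

// Assumption: only the WHERE and ORDER BY fields of the example query count.
const queryFields = ["customerId", "dateCreated"];

const indexes = [
  ["customerId"],
  ["customerId", "dateCreated"],
  ["customerId", "dateCreated", "name"],
  ["name", "customerId", "dateCreated"],
];

const candidates = indexes.filter((ix) => isCandidate(ix, queryFields));
```

Under these assumptions `candidates` holds the first three indexes, and the one starting with `name` is filtered out.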
If it found both [customerId] and [customerId], [dateCreated], [name], its decision to prefer one over the other would depend on the index statistics, which in turn depend on estimates of how the data in those fields is distributed. If [customerId], [dateCreated] were defined, it should prefer that over the other two unless you give a specific index hint to the contrary.
In my experience it is not uncommon to see one index defined for every field, though this is rarely optimal: the extra work needed to update the indexes on every insert/update, and the extra space needed to store them, is wasted when half of them may never get used. That said, unless your DB sees write-heavy loads, performance is not going to suffer badly even with the excess indexes.
Specific indexes for frequent queries that would otherwise be slow due to table or index scanning are generally a good idea, though don't overdo it, as you could be exchanging one performance issue for another. If you do define [customerId], [dateCreated] as an index, for example, remember that the query planner will also be able to use it for queries that would otherwise use an index on just [customerId]. While using just [customerId] would be slightly more efficient than using the compound index, this gain may be offset by having two indexes competing for space in RAM instead of one (though if your entire normal working set fits easily into RAM, this extra memory competition may not be an issue).
Without combining the fields in your data set you could just create a compound index using both. No changes to your data necessary. To do so, just create the index like so:
db.collection.createIndex({"user" : 1, "domain" : 1})
Docs are here:
http://www.mongodb.org/display/DOCS/Indexes#Indexes-CompoundKeys
Once you have created such a compound key it essentially makes an index on the leftmost element (user in my example above) redundant, and so an index on user (if it exists) could be removed.
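To make the redundancy point concrete, here is a rough sketch that flags indexes whose key pattern is a leftmost prefix of another index. The helper names `isPrefixOf` and `findRedundantIndexes` are hypothetical, and the check is deliberately conservative: it ignores index options (unique, sparse and so on), which can be a good reason to keep an index that looks redundant.

```javascript
// Returns true when `shorter` is a strict leftmost prefix of `longer`:
// same fields, in the same order, with the same direction.
function isPrefixOf(shorter, longer) {
  const a = Object.entries(shorter);
  const b = Object.entries(longer);
  if (a.length >= b.length) return false;
  return a.every(([field, dir], i) => b[i][0] === field && b[i][1] === dir);
}

// Lists the key patterns that are covered by some longer index.
function findRedundantIndexes(indexes) {
  return indexes.filter((candidate) =>
    indexes.some((other) => other !== candidate && isPrefixOf(candidate, other))
  );
}

// Example key patterns, written as they would be passed to createIndex.
const existingIndexes = [{ user: 1 }, { user: 1, domain: 1 }, { domain: 1 }];
const redundant = findRedundantIndexes(existingIndexes);
```

With these example patterns only `{ user: 1 }` is flagged. `{ domain: 1 }` is kept because `domain` is not the leftmost field of the compound index.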
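Don't forget that the query optimizer caches the plan it has chosen and only re-evaluates it occasionally (in older MongoDB versions, after roughly 1,000 writes to the collection), so if you are testing the new index you will have to hint() it to make sure it is used.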
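As a small sketch of what such a test could look like, the helper below chains `find()`, `hint()` and `explain()` on a collection handle. The function name `explainWithHint` and the sample filter values are hypothetical. The collection is passed in as a parameter, so in the mongo shell you would call it with `db.collection`.

```javascript
// Runs a query with an explicit index hint and returns the explain output,
// so you can check which index was actually used.
// `collection` is assumed to be a mongo shell collection (e.g. db.collection).
function explainWithHint(collection, filter, indexSpec) {
  return collection
    .find(filter)    // the query you want to test
    .hint(indexSpec) // force the index with this key pattern
    .explain();      // report the chosen plan instead of returning documents
}

// Usage in the shell (the filter values are made up for illustration):
// explainWithHint(db.collection,
//                 { user: "jane", domain: "example.com" },
//                 { user: 1, domain: 1 });
```

Comparing the explain output with and without the hint is a quick way to see whether the compound index helps your query.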
Hi. It will have an effect, but not a large one. Whenever you change the collection, the index has to be updated as well, and while an index is being built in the foreground the collection is locked. To avoid this, you can build the index in the background:
db.collection.createIndex( { last_updated: 1}, {background: true} )
This lets you keep using the collection while the index is being built, so you should not see any disruption because of the index.
As you mentioned that you sort on "last_updated", I strongly recommend using an index. You will get high read performance at a minimal indexing cost.