The resident memory size represents the number of pages in memory actually touched by the mongod
process. If that is significantly lower than the available memory, and your data exceeds the available memory (yours does), then it could simply be a case of mongod not having actively touched enough pages yet.
To determine if this is the case, run free -m; the output should look something like this:
$ free -m
             total       used       free     shared    buffers     cached
Mem:          3709       3484        224          0         84       2412
-/+ buffers/cache:        987       2721
Swap:         3836        156       3680
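To make the check concrete, here is a small sketch (in the same JavaScript used by the mongo shell, runnable under node) that parses output in the shape shown above and asks whether cached is close to total. The 80% threshold is an illustrative assumption, not an official cutoff.

```javascript
// Sketch of the check described above: is the filesystem cache ("cached")
// close to total memory? The 0.8 threshold is an illustrative assumption.
function cacheIsWarm(freeOutput, threshold = 0.8) {
    const memLine = freeOutput.split("\n").find((l) => l.startsWith("Mem:"));
    // Columns on the Mem: line are total, used, free, shared, buffers, cached
    const [total, , , , , cached] = memLine.split(/\s+/).slice(1).map(Number);
    return cached / total >= threshold;
}

const sample = [
    "             total       used       free     shared    buffers     cached",
    "Mem:          3709       3484        224          0         84       2412",
    "-/+ buffers/cache:        987       2721",
    "Swap:         3836        156       3680",
].join("\n");

console.log(cacheIsWarm(sample)); // false: 2412 / 3709 is about 0.65
```

With the numbers from my example the check returns false, which matches the interpretation below: the cache has plenty of room left.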
In my example, cached (2412) is not close to the total (3709), which means that not only has mongod not touched enough pages, the filesystem cache has not even been filled by pages being read from disk in general.
A quick remedy for this would be the touch command (added in 2.2) - it should be used with caution on large data sets as it will attempt to load everything into RAM even if the data is far too large to fit (causing a lot of disk IO and page faults). It will certainly fill up the memory effectively though :)
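For reference, touch is a database command, so a run from the mongo shell looks like this (the collection name "mytbl" is just the example name used later in this answer; adjust to your own):

```javascript
// Load the collection's data files and indexes into RAM (MongoDB 2.2+).
// Use with caution on data sets larger than RAM, as noted above.
db.runCommand({ touch: "mytbl", data: true, index: true })
```

You can pass data: true, index: true, or both, depending on what you want pre-warmed.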
If your cached value is close to the total available, then your issue is that a large number of pages being read into memory from disk are not relevant to (and hence not touched by) the mongod process. The usual candidate for this kind of discrepancy is readahead. I've already covered that particular topic elsewhere in detail, so I'll just link those two answers for future reading if necessary.
First, 2.3 was a development branch that became 2.4, so you should not be using it any longer - hopefully you mean 2.2, which is still supported and developed (though it too will reach end of life soon, as of writing this answer).
The group method you mention runs server-side JavaScript and is not going to be fast, especially in 2.2, which used the old SpiderMonkey engine and takes an exclusive JavaScript lock. Generally I would not recommend using it, especially with a sharded cluster, where it is unsupported.
Instead you should use the aggregation framework, which was added in 2.2, improved in 2.4, and will be even better in 2.6. The framework includes a $group operator, and many others, that you can use in a "pipeline" to achieve the results you want. If that seems odd, then I recommend the pipeline explanation docs here - for anyone familiar with the Linux/Unix shell, the concept should be very familiar.
To do a group on your field you would do something like this (it would be easier to be specific if you had actually included a sample document and your desired output):
db.mytbl.aggregate([
    { $group : { _id : { date : "$date_col" }, count : { $sum : 1 } } }
])
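If it helps to see what that $group stage computes, here is a plain-JavaScript sketch of the same grouping, runnable without a server. The sample documents and their date_col values are invented for illustration, since no sample document was included in the question.

```javascript
// Plain-JavaScript equivalent of the $group stage above: bucket documents
// by date_col and count each bucket, like { $sum : 1 } does per group.
// The sample documents below are made up for illustration.
const docs = [
    { date_col: "2013-05-01", value: 1 },
    { date_col: "2013-05-01", value: 7 },
    { date_col: "2013-05-02", value: 3 },
];

const counts = {};
for (const doc of docs) {
    counts[doc.date_col] = (counts[doc.date_col] || 0) + 1;
}

// Shape the buckets like the aggregation framework's output documents.
const results = Object.entries(counts).map(([date, count]) => ({
    _id: { date },
    count,
}));

console.log(results);
// [ { _id: { date: '2013-05-01' }, count: 2 },
//   { _id: { date: '2013-05-02' }, count: 1 } ]
```

The real pipeline does this server side, with no JavaScript lock, which is why it is preferred over the group method.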
You can find some more general examples here:
http://docs.mongodb.org/manual/applications/aggregation/
Best Answer
You will have to use the MongoDB aggregation framework.
Check this example: https://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/