Mongodb – How to compare 1000 items every minute over 1 million records in mongodb without pulling all the data

mongodbnosqloptimizationperformancequery-performance

I am developing a real time stock price alert system that gets new prices every 1 min. There are 1000 tradable assets from one exchange currently and I plan to add other exchanges soon and 1 million records are stored in the database currently. A user is able to add alerts on any of the assets and is to be notified in real time (every minute)

The user would add alerts of type "Alert me if XYZ reaches 100$" I am using redis to store the current prices and mongodb to store the alerts.If the current price of the asset is 90$, this alert is considered to be a "greater than" alert else it is a "less than" alert. This attribute is represented by direction in the database. Here's what a single alert in the database looks like…

{
    "_id" : //alert id,
    "2" : //user id,
    "3" : e1:XYZ:1, //exchange: asset: direction
    "4" : 100,
    "5" : 0
}

I am indexing this database over the field "3". After 1 min, when I fetch all the stock prices, I want to be able to find all the "greater than" alerts that are below the current price indicating that the price has risen and all the "lesser" alerts that are above the current price indicating that the price has fallen.

Approach 1

I dynamically build a query in mongodb that is of the form

db.getCollections("alerts").find({
$or: [
{3: "e1:XYZ:0", 4: {$gt: 100}},
{3: "e1:XYZ:1", 4: {$lt: 100}},
...2000 more assets here
]
})

The process above is limited by the 16 MB document size restriction applicable while querying Mongodb

Approach 2

I found the above approach was very slow and therefore decided to get all the alerts stored in the database and do the comparisons on my node.js server which takes rougly 2.5 seconds per round to fetch data currently

  • Right way to do 1000 comparisons like this (server side or mongodb side)?
  • Can I further reduce the amount of data being sent from the database
    to server in any way?

Thanks

Best Answer

If you add "_id" to your index {"3":1, "_id":1 } and then projection of {"_id":1, "3":1} to query, you get result directly from index in the memory (if you have enough memory to hold index in the memory).

Adition, you could use in-memory database engine.