MongoDB replicaset number of connections too high

mongodb replication

I'm using a 3-node replica set with 2 data nodes (versions 2.6.1 and 2.6.3) and 1 arbiter node. I get frequent alerts that the number of connections exceeds 1000, even though there are only about 400 databases and I'm using connection pooling with Node.js.

On another, standalone instance that hosts many more databases than the first one, the number of connections mostly stays below 400.

How many connections can I normally expect (so that I can decide on a threshold)?

Best Answer

As Antonis said, the number of connections has little relation to the number of databases.

In general, the number of connections isn't something to worry about. The MongoDB drivers keep the connections alive to reuse them and to avoid the overhead of setting up new connections.

However, each connection is provided with about 1MB of stack on the server side. Unnecessary connections might eat up precious RAM, which MongoDB uses to hold the indices and as much of the working set as possible in order to speed things up.
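To put the memory cost into numbers, here is a minimal sketch. It only relies on the rough figure of 1MB of stack per connection given above; the helper name `connectionStackMb` is made up for illustration.

```javascript
// Assumption taken from the answer: roughly 1 MB of stack per connection.
const STACK_MB_PER_CONNECTION = 1;

// Hypothetical helper: estimated RAM (in MB) held by connection stacks.
function connectionStackMb(connections, mbPerConnection = STACK_MB_PER_CONNECTION) {
  return connections * mbPerConnection;
}

console.log(connectionStackMb(1000)); // 1000 (MB), i.e. roughly 1 GB
console.log(connectionStackMb(400));  // 400 (MB)
```

So the 1000 connections from the question would cost roughly 1 GB that is no longer available for indices and the working set.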

In case you have enough RAM on your server, you have nothing to worry about – just adjust your alert thresholds to more suitable numbers. If you have a lot of page faults, however, you should investigate a bit further.

Since you are using connection pooling, it is safe to assume that you either have more concurrent connections from your application than your MongoDB server (the hardware, that is) can handle, or you are opening unnecessary connections.

High number of concurrent connections

As a rule of thumb, your MongoDB server should be able to handle as many connections as you have concurrent requests. In order to give you a decent amount of time to scale out when you reach the server's limits, your alert should trigger at about 80% utilization. For example, assuming your server can handle about 1500 connections easily, your alert should go off at

1500 * 0.8 = 1200 connections
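The same calculation as a small sketch, in case you want to derive the threshold from a measured capacity. The function name `alertThreshold` and the default of 0.8 are illustrative; the capacity itself is something you have to determine for your own server.

```javascript
// Hypothetical helper: connection count at which the alert should fire.
// `maxConnections` is what the server handles comfortably (an assumption
// you need to measure); `utilization` is the fraction that triggers the alert.
function alertThreshold(maxConnections, utilization = 0.8) {
  if (utilization <= 0 || utilization > 1) {
    throw new RangeError('utilization must be in the range (0, 1]');
  }
  return Math.round(maxConnections * utilization);
}

console.log(alertThreshold(1500)); // 1200
```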

If your server gets into problems with the 1000 connections you mentioned, or when you hit 80% utilization, you should first scale up, for example by putting more RAM into the machine or, more generally speaking, by eliminating whatever limitation prevents the server from handling this number of connections. How far to scale up is not easy to determine, but generally speaking, you want to scale up as long as you get more bang than you have to put bucks into it.

There is a point where the bang you get for each buck you put in decreases drastically, and of course you want to stop scaling up a bit earlier. Now what can you do in case your server still does not meet your requirements? The answer is to scale out, which in the MongoDB world means setting up a sharded cluster. A word of warning: while creating a sharded cluster is not rocket science, there are quite a few caveats and pitfalls. Make sure you have read the documentation about sharding thoroughly before implementing a sharded cluster. A good consultant is usually worth the money, too.

That being said: Usually it is the application server which first reaches the limit of concurrent users it can handle, so have a close look there.

Multiple open connections per concurrent user

Usually, you request a connection from the pool simply by doing your work on the reused db object, and the connection is returned to the pool transparently once the work is done (simplified, but sufficient in this context). The pool is handled by the client transparently. Each time you call MongoClient.connect, a new connection pool is created. You should call this method only once per application and reuse the result. Double-check that you follow this pattern.
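A minimal sketch of the "connect once, reuse everywhere" pattern. The names `createClientProvider` and `getClient` are made up for illustration, and the connect function is passed in so that the example runs without a database; in a real application it would wrap `MongoClient.connect`, assuming a driver version in which that call returns a promise.

```javascript
// Hypothetical helper: caches the result of a single connect call, so the
// whole process shares one client and therefore one connection pool.
function createClientProvider(connect) {
  let clientPromise = null;
  return function getClient() {
    if (clientPromise === null) {
      clientPromise = connect().catch((err) => {
        clientPromise = null; // forget the failed attempt so a retry is possible
        throw err;
      });
    }
    return clientPromise;
  };
}

// In a real application (assumption: promise-returning driver):
//   const { MongoClient } = require('mongodb');
//   const getClient = createClientProvider(() => MongoClient.connect(url));

// Demo with a stub in place of the real driver call.
let calls = 0;
const getClient = createClientProvider(() => {
  calls += 1;
  return Promise.resolve({ name: 'stub client' });
});

getClient();
getClient();
console.log(calls); // 1
```

Every module that needs the database calls `getClient()`; no module calls the connect function directly, so no second pool is opened by accident.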

Conclusion

  1. Make sure you reuse the db object returned by MongoClient.connect.
  2. Find out your number of concurrent users.
  3. Check whether the number of connections made to the replica set is much higher than the number of concurrent users. If not, everything is working as expected.
  4. If the numbers roughly match and you are experiencing problems from the database side (long response times, high latency), either scale up or out after you made sure that it is not your application slowing the responses down.
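
The comparison in step 3 can be sketched roughly as follows. The input mirrors the `connections` section of the `serverStatus` output, which reports a `current` count; the function name `diagnoseConnections` and the factor of 2 standing in for "much higher" are arbitrary assumptions that you should adjust to your own setup.

```javascript
// Hypothetical helper: compares open connections with concurrent users.
// `connections` has the shape of serverStatus().connections;
// `factor` is an arbitrary stand-in for "much higher" (an assumption).
function diagnoseConnections(connections, concurrentUsers, factor = 2) {
  if (connections.current > concurrentUsers * factor) {
    return 'check-pooling'; // far more connections than users: look for extra pools
  }
  return 'as-expected'; // numbers roughly match
}

console.log(diagnoseConnections({ current: 1000 }, 300)); // check-pooling
console.log(diagnoseConnections({ current: 350 }, 300));  // as-expected
```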