Since there is already an answer submitted, and a useful and valid one at that, I do not want to distract from it, but there are points to raise that go well beyond a short comment. So consider this an "augmentation", hopefully valid in its own right but primarily in addition to what has already been said.
The real answer is to consider "how your application uses the data", and to be aware of the factors in both a "sharded environment" and your proposed "container environment" that affect this.
The Background Case
The general rationale behind the recommendation to co-locate the mongos process with the application instance is to remove the network overhead otherwise required for the application to communicate with that mongos process. Of course it is also "recommended practice" to specify several mongos instances in the application connection string, so that if that "nearest" node is unavailable for some reason another can be selected, albeit with the possible overhead of contacting a remote node.
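As a rough illustration, assuming a Node.js application using the official driver's callback API (the hostnames and database name here are hypothetical; the localhost entry stands for the co-located mongos):

// Minimal sketch: list several mongos routers in the connection string so the
// driver can fail over if the nearest one becomes unavailable.
var MongoClient = require('mongodb').MongoClient;

var uri = 'mongodb://localhost:27017,mongos-b.example.net:27017,mongos-c.example.net:27017/mydb';

MongoClient.connect(uri, function (err, db) {
    if (err) throw err;
    // The driver routes operations to an available mongos from the list,
    // preferring the one with the lowest ping time.
    db.collection('things').findOne({}, function (err, doc) {
        console.log(doc);
    });
});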
The "docker" case you mentions seems somewhat arbitrary. While it is true that one of the primary goals of containers ( and before that, something like BSD jails or even chroot ) is generally to achieve some level of "process isolation", there is nothing really wrong with running multiple processes as long as you understand the implications.
In this particular case the mongos is meant to be "lightweight" and run as an "additional function" to the application process, so that it is pretty much a "paired" part of the application itself. Docker images themselves don't have an "init"-like process, but there is nothing really wrong with running a process controller such as supervisord (for example) as the main process for the container, which also gives you a point of process control over that container. This situation of "paired processes" is a reasonable case and a common enough ask that there is official documentation for it.
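For illustration only, a supervisord configuration for such a container might look roughly like the following; the paths, hostnames and options are placeholders, and the exact --configdb syntax depends on the MongoDB version:

; supervisord.conf (sketch only, not a recommended configuration)
[supervisord]
nodaemon=true          ; keep supervisord in the foreground as the container's main process

[program:mongos]
command=/usr/bin/mongos --configdb cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019
autorestart=true

[program:app]
command=node /srv/app/server.js
autorestart=true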
If you choose that kind of "paired" operation for deployment, then it does indeed address the primary point of keeping a mongos instance on the same network connection, and indeed the same "server instance", as the application server itself. It can also be viewed as a case where, if the "whole container" were to fail, that node as a whole would simply be invalid. Not that I would rely on that alone; in fact you probably should still configure connections to look for other mongos instances, even if these are only accessible over a network connection that increases latency.
Version Specific / Usage Specific
With that point made, the other consideration comes back to the original reason for co-locating the mongos process with the application: network latency. In versions of MongoDB prior to 2.6, and specifically with regard to operations such as those using the aggregation framework, there was a lot more network traffic and subsequent post-processing work performed by the mongos process to deal with data from the different shards. That is not so much the case now, as a good deal of the processing workload can be performed on the shards themselves before "distilling" the results to the "router".
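To make that concrete, consider a hypothetical aggregation issued through a mongos against a sharded collection (mongo shell syntax; the collection and field names are made up):

// On 2.6+ each shard runs the $match and the initial $group over its own data,
// and only the partial results are merged afterwards, rather than the router
// carrying the bulk of the post-processing work as in earlier versions.
db.orders.aggregate([
    { $match: { status: "shipped" } },
    { $group: { _id: "$customerId", total: { $sum: "$amount" } } }
]);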
The other case is your application's usage patterns themselves with regard to the sharding. That means whether the primary workload is "distributing the writes" across multiple shards, or rather a "scatter-gather" approach that consolidates read requests. In those scenarios the amount of work the mongos actually does, and therefore how much its placement matters, can vary considerably.
Test, Test and then Test Again
The final point here is really self-explanatory, and comes down to the basic consensus of any sane response to your question. This is not a new thing for MongoDB or any other storage solution: your actual deployment environment needs to be tested on its "usage patterns", as close to actual reality as possible, just as much as any "unit testing" of expected functionality from core components or overall results.
There really is no "definitive" statement of "configure this way" or "use in this way" that makes sense, apart from testing what "actually works best" for your application performance and reliability as expected.
Of course the "best case" will always be to not "crowd" the mongos
instances with requests from "many" application server sources. But then to allow them some natural "parity" that can be distributed by the resource workloads available to having at "least" a "pool of resources" that can be selected, and indeed ideally in many cases but obviating the need to induce an additional "network transport overhead".
That is the goal, but ideally you can "lab test" the different candidate configurations in order to come to a "best fit" for your eventual deployment.
I would also strongly recommend the "free" (as in beer) courses already mentioned, no matter what your level of knowledge. I find that such course material often offers "hidden gems" that give more insight into things you may not have considered or may otherwise have overlooked. The M102 class mentioned is constructed and conducted by Adam Commerford, who I can attest has a high level of knowledge of large-scale deployments of MongoDB and other data architectures. It is worth the time to at least consider a fresh perspective on what you may think you already know.
The hard connection limit was actually 20,000 but it was removed in version 2.6 and you are now only limited by memory or open file descriptors (whichever you run out of first). That being said, with connections running into the thousands, even if you can manage it you will be using a lot of memory (1MB per connection for the stack).
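To see where a given mongod or mongos actually stands, you can check the connection counters from the mongo shell (the numbers in the comment are illustrative only):

// Run against the process you are interested in.
db.serverStatus().connections
// => { "current" : 1043, "available" : 50157, "totalCreated" : NumberLong(482113) }
// "current" is the number of open connections; "available" is how many more the
// process will accept before hitting its file-descriptor or configured limit.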
At that point you start to evaluate other options for mongos deployment. You can have a dedicated set of mongos hosts and take advantage of some of the nice HA options built into the drivers by passing in a list of possible mongos processes to use. If you are concerned about latency, you could potentially co-locate (some of) the mongos processes on your shard hosts.
Basically you've started to hit the number of nodes at which you are into more advanced design decisions, and it really depends on several factors what is right for you now and what is right for you in the future. If you have some money to throw at it, this is perfect fodder for the type of consulting MongoDB offers (disclosure: I was once one of the people delivering such consulting). If not, you are asking the right questions, so keep an eye on resource utilisation and see what makes sense for your deployment as you grow.
Best Answer
As Antonis said, the number of connections has little relation to the number of databases.
In general, the number of connections isn't something to worry about. The MongoDB drivers keep connections alive to reuse them and to avoid the overhead of setting up new connections.
However, each connection is allocated about 1 MB of stack on the server side, so unnecessary connections eat up precious RAM, which MongoDB would otherwise use for the indices and as much of the working set as possible in order to speed things up. For example, 2,000 idle connections tie up roughly 2 GB of RAM for connection stacks alone.
If you have enough RAM on your server, you have nothing to worry about – just adjust your alert thresholds to more suitable numbers. If you have a lot of page faults, however, you should investigate a bit further.
Since you are using connection pooling, it is safe to assume that you either have more concurrent connections to your application than your MongoDB server (the hardware, that is) can handle, or you are opening unnecessary connections.
High number of concurrent connections
As a rule of thumb, your MongoDB server should be able to handle as many connections as you have concurrent requests. In order to give you a decent amount of time to scale out when you approach the server's limits, your alert should trigger at about 80% utilization. For example, if your server can handle about 1,500 connections easily, your alert should go off at around 1,200 connections.
If your server gets into problems with the 1,000 connections you mentioned, or when you hit 80% utilization, you should first scale up, for example by putting more RAM into the machine or – more generally speaking – by eliminating the limitation which prevents the server from handling this number of connections. How far to scale up is not easy to determine, but generally speaking, you want to keep scaling up as long as you get more bang out of it than the bucks you put in.
There is a point where the bang you get for each buck you put in decreases drastically – of course you want to stop scaling up a bit earlier than that. Now what can you do if your server still does not meet your requirements? The answer is to scale out, which in MongoWorld means setting up a sharded cluster. A word of warning: while creating a sharded cluster is no rocket science, there are quite a few caveats and pitfalls. Make sure you have read the documentation about sharding thoroughly before implementing a sharded cluster. A good consultant is usually worth the money, too.
That being said: Usually it is the application server which first reaches the limit of concurrent users it can handle, so have a close look there.
Multiple open connections per concurrent user
Usually, you request a connection from the pool simply by doing your work on the reused db object, and the connection is returned to the pool transparently once that work is done (simplified, but sufficient in this context). The pool itself is handled transparently by the client. Each time you call MongoClient.connect, however, a new connection pool is created. You should only call this method once per application and reuse the result. Double-check that you follow the described pattern.
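As a minimal sketch of that pattern with the Node.js driver's callback API (the URI, database and collection names are placeholders):

var MongoClient = require('mongodb').MongoClient;

var db; // reused for the whole lifetime of the application

// Call connect exactly once at startup; this creates the single connection pool.
MongoClient.connect('mongodb://localhost:27017/mydb', function (err, database) {
    if (err) throw err;
    db = database;
});

// Anti-pattern: calling MongoClient.connect inside every request handler creates
// a new pool (and new connections) per call, which is exactly what drives the
// connection count up.

// Correct: reuse the already-connected db object and let its pool do the work.
function handleRequest(callback) {
    db.collection('users').findOne({ active: true }, callback);
}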
Conclusion
Reuse the db object returned by MongoClient.connect across your application rather than connecting again for each operation; that way the number of open connections stays bounded by the pool size.