Since there is already an answer submitted, and a useful and valid one at that, I do not want to distract from its usefulness, but there are indeed points to raise that go well beyond a short comment. So consider this an "augmentation", hopefully valid, but primarily in addition to what has already been said.
The truth is that you really need to consider "how your application uses the data", and to also be aware of the factors in a "sharded environment" (as well as your proposed "container environment") that affect this.
The Background Case
The general recommended practice for co-locating the `mongos` process with the application instance is to obviate the network overhead otherwise required for the application to communicate with that `mongos` process. Of course it is also "recommended practice" to specify a number of `mongos` instances in the application connection string, so that if the "nearest" node should become unavailable for some reason, another can be selected, albeit with the possible overhead of contacting a remote node.
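For illustration (host names and ports here are hypothetical), a seed list naming several `mongos` instances might look like:

```
mongodb://mongos1:27017,mongos2:27017,mongos3:27017/mydb
```

The driver will typically prefer the lowest-latency member of that list (usually the co-located one) and fall back to the others if it becomes unavailable.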
The "docker" case you mention seems somewhat arbitrary. While it is true that one of the primary goals of containers (and before that, something like BSD jails or even chroot) is generally to achieve some level of "process isolation", there is nothing really wrong with running multiple processes as long as you understand the implications.
In this particular case the `mongos` is meant to be "lightweight" and run as an "additional function" alongside the application process, in such a way that it is pretty much a "paired" part of the application itself. Docker images themselves don't have an "init"-like process, but there is not really anything wrong with running a process controller like supervisord (for example) as the main process for the container, which then also gives you a point of process control over that container. This situation of "paired processes" is a reasonable case and also a common enough ask that there is official documentation for it.
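If you go that route, a minimal sketch of such a configuration (assuming supervisord is installed in the image; program names, paths, and the config server addresses are hypothetical) might look like:

```ini
; supervisord.conf - supervisord runs as the container's main process
; and manages the application and mongos as "paired" child processes.
[supervisord]
nodaemon=true            ; stay in the foreground so the container keeps running

[program:mongos]
; The exact --configdb form depends on your MongoDB version;
; these config server addresses are placeholders.
command=/usr/bin/mongos --configdb cfg1:27019,cfg2:27019,cfg3:27019 --port 27017
autorestart=true         ; restart the router if it exits

[program:app]
command=/usr/bin/node /app/server.js
autorestart=true
```

The container then starts with something like `supervisord -c /etc/supervisord.conf` as its command, giving you a single point of control over both processes.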
If you choose that kind of "paired" operation for deployment, then it does indeed address the primary point of keeping a `mongos` instance on the same network connection, and indeed the same "server instance", as the application server itself. It can also be viewed as a case where, if the "whole container" were to fail, then that node in itself would simply be invalid. Not that I would recommend relying on that, and in fact you probably should still configure connections to look for other `mongos` instances, even if these are only accessible over a network connection that increases latency.
Version Specific / Usage Specific
Now that that point is made, the other consideration here comes back to the initial point of co-locating the `mongos` process with the application for network latency purposes. In versions of MongoDB prior to 2.6, and specifically with regard to operations such as the aggregation framework, there would be a lot more network traffic and subsequent post-processing work performed by the `mongos` process when dealing with data from different shards. That is not so much the case now, as a good deal of the processing workload can be performed on the shards themselves before "distilling" the results to the "router".
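You can inspect how much of an aggregation is pushed down to the shards versus merged at the router by looking at the explain output; as a minimal sketch (the collection name `orders` is hypothetical), in the mongo shell:

```js
// On a sharded cluster the explain output includes a "splitPipeline"
// section showing the stages each shard runs, and the merge stages
// that run after the per-shard results are combined.
db.orders.aggregate(
  [
    { $match: { status: "A" } },
    { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
  ],
  { explain: true }
)
```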
The other case is your application's own usage patterns with regard to the sharding. That means whether the primary workload is in "distributing the writes" across multiple shards, or is instead a "scatter-gather" approach in consolidating read requests. In those scenarios the amount of consolidation work the `mongos` performs, and therefore the benefit you get from co-locating it, can differ considerably, so it pays to know which pattern dominates.
Test, Test and then Test Again
So the final point here is really self-explanatory, and comes down to the basic consensus of any sane response to your question. This is not a new thing for MongoDB or any other storage solution: your actual deployment environment needs to be tested on its "usage patterns", as close to actual reality as possible, just as much as any "unit testing" of expected functionality from core components or overall results needs to be done.
There really is no "definitive" statement to say "configure this way" or "use in this way" that actually makes sense, apart from testing what "actually works best" for your application performance and reliability as expected.
Of course the "best case" will always be to not "crowd" the `mongos` instances with requests from "many" application server sources, but rather to distribute requests according to the available resource workloads: having at "least" a "pool of resources" that can be selected from, while ideally, in many cases, also obviating the need to incur additional "network transport overhead".
That is the goal, but ideally you can "lab test" the different perceived configurations in order to come to a "best fit" solution for your eventual deployment.
I would also strongly recommend the "free" (as in beer) courses available, as already mentioned, no matter what your level of knowledge. I find that various course material sources often offer "hidden gems" that give more insight into things you may not have considered or otherwise overlooked. The M102 class as mentioned is constructed and conducted by Adam Commerford, who I can attest has a high level of knowledge of large-scale deployments of MongoDB and other data architectures. It is worth the time to at least consider a fresh perspective on what you may think you already know.
I will try to be brief. Most of the answers are documented.
Q1: There are many ways. Shard a collection using `{_id: "hashed"}` as the shard key, set the initial number of chunks to 2, and then start inserting documents (100 is a good number of docs). You will see documents on both shards; see the sketch after this list.
Q2: It uses range partitioning. Each shard holds ranges of the shard key.
Q3: Sharding is an automated process; all you need to do is determine a shard key.
Q4: Stop the balancer, back up shards rs1 and rs2 and the config server, then start the balancer.
Q5: On a sharded cluster the access point should be the `mongos`.
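As a minimal sketch of the Q1 recipe (the namespace `test.demo` is hypothetical), in the mongo shell:

```js
// Enable sharding on the database, then shard the collection on a
// hashed _id key with 2 initial chunks (one per shard in a 2-shard cluster).
sh.enableSharding("test")
db.adminCommand({
  shardCollection: "test.demo",
  key: { _id: "hashed" },
  numInitialChunks: 2
})

// Insert 100 documents; the hashed key spreads them across both shards.
for (var i = 0; i < 100; i++) {
  db.getSiblingDB("test").demo.insert({ n: i })
}

// Verify the per-shard document distribution.
db.getSiblingDB("test").demo.getShardDistribution()
```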
Best Answer
A MongoDB sharded cluster deployment can contain collections that are either unsharded (the default when created) or sharded (based on your chosen shard key). Sharding a collection is (as at MongoDB 3.4) a two-step process: enable sharding on a database using `sh.enableSharding()` and then shard specific collections using `sh.shardCollection()`. Sharded and unsharded collections coexist in the same sharded cluster and your application generally does not have to be aware of whether collections are sharded.
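As a brief sketch of that two-step process (the database name `mydb`, collection `users`, and shard key are hypothetical), in the mongo shell:

```js
// Step 1: enable sharding on the database.
sh.enableSharding("mydb")

// Step 2: shard a specific collection on a chosen shard key.
sh.shardCollection("mydb.users", { userId: 1 })
```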
If you have a sharded cluster deployment (irrespective of whether you have enabled sharding on individual collections) you should always make your queries and updates through a `mongos` instance. The `mongos` instances handle details of routing queries to the correct shard(s) based on the current configuration and status of the backing `mongod` servers.
The "generally unaware" application design caveat alluded to earlier is that a few operations (for example: `$lookup` and `$graphLookup`) may not support using sharded collections as the source or destination. There are also some operational restrictions in a sharded cluster which are worth reviewing.
Sharding alone does not improve redundancy. Sharding is used to partition data across multiple servers so you can have multiple MongoDB instances presented as a single logical deployment. Ideally your shards should be backed by replica sets so you have data redundancy and failover (which both rely on replication). I would strongly recommend upgrading a production deployment to a replica set before moving to a sharded cluster.
If you plan on eventually sharding it would be reasonable to get your sharded cluster infrastructure in place so you can minimise disruptions in future. For example, moving from a replica set to a sharded cluster deployment involves setting up config servers, `mongos` instances, and changing your connection seed list from replica set members to `mongos` instances.
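As a hedged illustration of that seed list change (host names, ports, and the replica set name `rs0` are hypothetical):

```
# Before: connecting to replica set members
mongodb://db1:27017,db2:27017,db3:27017/mydb?replicaSet=rs0

# After: connecting through mongos routers (no replicaSet option)
mongodb://router1:27017,router2:27017/mydb
```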
Is the collection that you must shard stressing your current deployment? Before moving to a larger deployment with more moving parts I would encourage you to try to understand and tune your existing deployment. There may be some easy wins that could improve performance ... or conversely, unfortunate design choices that won't scale well in a sharded environment. Helpful starting points include the MongoDB Production Notes, Operations Checklist, and Security Checklist.
If you have good understanding or estimation of your data model and distribution patterns, I would suggest sharding in a test environment using representative data. There are many helpful tools for generating fake (but probabilistic) data if needed. For an example recipe, see: Stack Overflow: duplicate a collection into itself.
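A minimal sketch of that kind of recipe (the collection name `sample` is hypothetical): copy every document in a collection back into itself with a fresh `_id`, doubling the data set on each pass:

```js
// Snapshot the existing documents first so the newly inserted copies
// are not picked up and re-duplicated within the same pass.
var docs = db.sample.find().toArray();
docs.forEach(function (doc) {
  delete doc._id;          // let MongoDB assign a fresh _id
  db.sample.insert(doc);   // insert the copy, doubling the collection
});
```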
Assuming you are describing a sharded cluster with two shards, that is correct. Sharded collections will be distributed across shards while unsharded collections will live on a primary shard for each database. `mongos` provides access to all databases in the same sharded cluster.
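If you want to confirm which shard acts as the primary shard for each database, `sh.status()` in the mongo shell lists every database with a `primary` entry:

```js
// Prints a cluster overview: shards, databases with their primary shard,
// and chunk distribution for sharded collections.
sh.status()
```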