When a MongoDB instance enters a ROLLBACK state and the rollback data exceeds 300MB, you have to intervene manually. The instance will stay in the rollback state until you take action to save, remove, or move that data. The node (now a secondary) should then be resynced to bring it back in line with the primary. This does not have to be a full resync, but that is the simplest way.
Multiple rollbacks are a symptom rather than the cause of a problem. Rollback only happens when a secondary that was not in sync (either due to lag or an issue with replication) becomes primary and takes writes. So the problems that cause that to happen in the first place are what need to be taken care of. The rollback itself is something you need to deal with as an admin, because there are too many potential pitfalls for MongoDB to reconcile the data automatically.
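To make the mechanism concrete, here is a minimal toy model in Python. It is not MongoDB's actual rollback algorithm: the oplogs are plain lists, and the `(term, op)` tuples and the `find_rollback` helper are hypothetical names made up for illustration. It only shows why writes accepted by an old primary but never replicated end up as rollback data once an out-of-sync secondary takes over.

```python
def find_rollback(old_primary_oplog, new_primary_oplog):
    """Return (common_prefix_length, entries_to_roll_back).

    Each oplog is a list of (term, op) tuples in the order they were applied.
    Entries the old primary holds past the last common point were never
    replicated, so they are the ones that would have to be rolled back.
    """
    common = 0
    for mine, theirs in zip(old_primary_oplog, new_primary_oplog):
        if mine != theirs:
            break
        common += 1
    return common, old_primary_oplog[common:]


# Node A was primary and accepted three writes; B only replicated the first.
node_a = [(1, "insert x=1"), (1, "insert x=2"), (1, "insert x=3")]
node_b = [(1, "insert x=1")]

# A goes down, the lagging B becomes primary (term 2) and takes a new write.
node_b.append((2, "insert x=4"))

# When A rejoins, its unreplicated writes conflict with B's history.
common, rolled_back = find_rollback(node_a, node_b)
print(common)       # 1
print(rolled_back)  # [(1, 'insert x=2'), (1, 'insert x=3')]
```

The more the secondary lags at the moment of failover, the longer the `rolled_back` list gets, which is why fixing the lag is the real cure.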
If you want to simulate this again for testing purposes, I have outlined how to do so here:
http://comerford.cc/2012/05/28/simulating-rollback-on-mongodb/
Eventually, this data will be stored in a collection (in the local DB) rather than dumped to disk, which will present opportunities to deal with it more effectively:
https://jira.mongodb.org/browse/SERVER-4375
At the moment though, once a rollback occurs, as you found, manual intervention is required.
Finally, the manual contains similar information to Kristina's blog now:
https://docs.mongodb.com/manual/core/replica-set-rollbacks
1 - You can have more than 1 mongos instance and connect to whichever you want (the client driver should have that option). Nevertheless, a mongos is just a router, meaning it will only route the requests to the correct shard(s).
2 - Yes, a config server can run on the same machine as a primary/secondary; just don't put the config instances together on one machine (you are required to have 3 for redundancy).
3 - A replica set is a group consisting of 1 primary and N secondaries. If you are sharding, each group will be a shard. That means that the single primary and all secondaries of the same group will have the same data (replicated). Also, having more secondaries won't help you increase the write performance, as only the primary is able to perform inserts/updates/deletes. Considering this, the only way to increase the write performance is by having more shards (horizontal scaling) and, of course, choosing a good sharding key that will balance data across all shards (you don't want all your data to be in 1 shard while all the others remain empty). For further info, check this official doc explaining the concept of a replica set and its members: http://docs.mongodb.org/manual/core/replica-set-members/
Also, instances of a replica set shard (the primary and N secondaries) are not required to be on the same machine. You might want to put them on separate machines to increase redundancy and perhaps get a better distribution of the load.
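To illustrate the point about choosing a sharding key that balances data, here is a rough Python sketch. It is a toy model, not MongoDB's chunk or balancer logic: `shard_for`, `distribution`, the 4-shard cluster, and the `user_id`/`status` fields are all hypothetical, and a simple hash-modulo stands in for real chunk placement. It just compares how documents spread out for a high-cardinality key versus a key with a single value.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical number of shards


def shard_for(key_value, num_shards=NUM_SHARDS):
    """Pick a shard by hashing the shard key value (toy model only)."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(docs, key_field):
    """Count how many documents land on each shard for a given key field."""
    return Counter(shard_for(doc[key_field]) for doc in docs)


# 10,000 hypothetical documents: unique user_id, constant status.
docs = [{"user_id": i, "status": "active"} for i in range(10_000)]

by_user = distribution(docs, "user_id")   # high-cardinality key
by_status = distribution(docs, "status")  # single-valued key

print("user_id:", sorted(by_user.items()))
print("status: ", sorted(by_status.items()))
```

With `user_id` every shard receives a share of the documents, so writes are spread across all primaries. With `status` everything lands on one shard, and adding shards would not help write performance at all.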
The easy, albeit a bit insecure way
dbpath
This is a bit insecure because it is unknown why the secondaries entered the Recovering state in the first place.
The more secure, but also more intrusive way
As above, but stop your application during the process. This rules out the possibility that your application is inserting more data than the secondaries are able to replicate. However, the problem may still recur once the application is back in production.
The most secure, but also most intrusive way
dbpath
on both secondaries' dbpath
to both secondaries' dbpath
Some notes:
Use MMS. It's free, it's easy to set up and it gives you good information about your replica set. Try to keep the value for "replication lag" around 0, and do whatever is necessary to ensure that your replication lag never exceeds the "replication oplog window".
Always make sure you have a 1Gb network and a (sorry) shitload of RAM. The more, the better. Additional rule of thumb: it is better to have half the RAM plus SSDs than double the RAM and no SSDs (as long as the RAM stays within reasonable limits).
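The note about replication lag and the oplog window boils down to one comparison, sketched below in Python. This is only an illustration under stated assumptions: the function names and timestamps are made up, and in practice you would read the real values from your monitoring or from the replica set status rather than hard-code them.

```python
from datetime import datetime, timedelta


def replication_lag(primary_optime, secondary_optime):
    """Lag: how far the secondary's last applied op is behind the primary's."""
    return max(primary_optime - secondary_optime, timedelta(0))


def oplog_window(first_entry_time, last_entry_time):
    """Window: the time span currently covered by the primary's oplog."""
    return last_entry_time - first_entry_time


def can_catch_up(lag, window):
    """A secondary can catch up only while the ops it needs are still in the oplog."""
    return lag < window


# Hypothetical timestamps for illustration only.
now = datetime(2024, 1, 1, 12, 0, 0)
window = oplog_window(now - timedelta(hours=6), now)

healthy = replication_lag(now, now - timedelta(seconds=2))
stale = replication_lag(now, now - timedelta(hours=7))

print(can_catch_up(healthy, window))  # True
print(can_catch_up(stale, window))    # False
```

Once the lag exceeds the window, the operations the secondary still needs have already been overwritten in the oplog, and it can no longer catch up through normal replication.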
Disclaimer: Always make a backup of production data before fiddling with it.