Compact will not recover space, because it really just defragments the data inside the existing data files, making the data more dense, more compact. In fact, a compact occasionally needs extra space to do its work and so can end up using up more space rather than reclaiming it.
Because a repair rewrites the data files from scratch, it reclaims the "wasted" space and recovers disk space.
Besides bringing the secondaries down and running a repair there as you mention, another option is to delete the data files from a secondary and re-sync them from the primary from scratch. This has a similar effect to a repair, because in both cases the data files are re-written, the difference being that you don't need ~2x the disk space because you wipe out the originals and pull the data from the primary.
Once you have done the first secondary, generally the quickest way to do the second would be to snapshot the first (with journaling enabled), or fsync and lock it, and just do a file-based copy to seed the other secondary.
Your final option, which is really just a variation of the above, assumes you have already run the repair on the primary: snapshot the primary (or fsync and lock it) and use its repaired files as a seed to populate a secondary.
These methods are covered in more detail as part of the Backup docs:
http://www.mongodb.org/display/DOCS/Backups#Backups-Methods
As Antonis said, the number of connections has little relation to the number of databases.
In general, the number of connections isn't something to worry about. The MongoDB drivers keep connections alive to reuse them and to avoid the overhead of setting up new ones.
However, each connection is allotted about 1 MB of stack on the server side. Unnecessary connections can therefore eat up precious RAM, which MongoDB uses for the indices and for keeping as much of the working set as possible in memory in order to speed things up.
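To get a feel for the numbers, a quick back-of-the-envelope calculation (the connection count below is made up for illustration):

```javascript
// Rough stack memory cost of open connections, at ~1 MB per connection.
const STACK_PER_CONNECTION_MB = 1;

function connectionStackMB(openConnections) {
  return openConnections * STACK_PER_CONNECTION_MB;
}

// Example: 5000 mostly-idle connections tie up roughly 5000 MB (~5 GB)
// of RAM that mongod could otherwise use for indices and the working set.
console.log(connectionStackMB(5000)); // 5000
```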
In case you have enough RAM on your server, you have nothing to worry about – just adjust your alert thresholds to more suitable numbers. If you have a lot of page faults, however, you should investigate a bit further.
Since you are using connection pooling, it is safe to assume that you either have more concurrent connections in your application than your MongoDB server (the hardware, that is) can handle, or you are opening unnecessary connections.
High number of concurrent connections
As a rule of thumb, your MongoDB server should be able to handle as many connections as you have concurrent requests. In order to give yourself a decent amount of time to scale out when you reach the server's limits, your alert should trigger at about 80% utilization. For example, assuming your server can handle about 1500 connections easily, your alert should go off at
1500 * 0.8 = 1200 connections
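As a small helper, this 80% rule can be written as follows; the 1500-connection capacity is just the example figure from above, not a measured limit:

```javascript
// Alert threshold at a given utilization of the server's connection capacity.
function alertThreshold(maxConnections, utilization = 0.8) {
  return Math.floor(maxConnections * utilization);
}

console.log(alertThreshold(1500)); // 1200
```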
If your server gets into problems with the 1000 connections you mentioned, or when you hit 80% utilization, you should first scale up, for example by putting more RAM into the machine or, more generally speaking, by eliminating the limitation which prevents the server from handling this number of connections. How far to scale up is not easy to determine, but generally speaking you want to keep scaling up as long as the bang you get outweighs the bucks you put in.
There is a point where the bang you get for each buck you put in decreases drastically – of course you want to stop scaling up a bit before that. Now what can you do in case your server still does not meet your requirements? The answer is to scale out, which in the MongoDB world means setting up a sharded cluster. A word of warning: while creating a sharded cluster is not rocket science, there are quite a few caveats and pitfalls. Make sure you have read the documentation about sharding thoroughly before implementing a sharded cluster. A good consultant is usually worth the money, too.
That being said: Usually it is the application server which first reaches the limit of concurrent users it can handle, so have a close look there.
Multiple open connections per concurrent user
Usually, you request a connection from the pool simply by doing your work on the reused db object, and the connection is returned to the pool transparently once the work is done (simplified, but sufficient in this context). The pool is handled by the client transparently. Each time you call MongoClient.connect, a new connection pool is created. You should only call this method once per application and reuse the result. Double-check that you follow the described pattern.
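The "connect once, reuse the db object" pattern can be sketched as below. Note that the MongoClient here is a stub standing in for the real driver, so the sketch runs without a server; with the actual mongodb package, getDb would wrap the real MongoClient.connect in the same way, and the URL is a made-up example.

```javascript
// Stub standing in for the real driver's MongoClient, so this sketch runs
// without a server. With the mongodb package you would use the real client.
let poolsCreated = 0;
const MongoClient = {
  connect(url, callback) {
    poolsCreated += 1;        // each connect() call builds a new pool
    callback(null, { url });  // hand back a (fake) db object
  },
};

// The pattern: call connect once, cache the db object, reuse it everywhere.
let cachedDb = null;
function getDb(callback) {
  if (cachedDb) return callback(null, cachedDb);
  MongoClient.connect("mongodb://localhost:27017/mydb", (err, db) => {
    if (err) return callback(err);
    cachedDb = db;
    callback(null, db);
  });
}

// Many requests, but only one pool is ever created:
getDb((err, db) => { /* handle request 1 with db */ });
getDb((err, db) => { /* handle request 2 with db */ });
console.log(poolsCreated); // 1
```

If instead you called MongoClient.connect inside every request handler, poolsCreated would grow with every request – which is exactly the "unnecessary connections" symptom described above.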
Conclusion
- Make sure you reuse the db object returned by MongoClient.connect.
- Find out your number of concurrent users.
- Check whether the number of connections made to the replica set is much higher than the number of concurrent users. If not, everything is working as expected.
- If the numbers roughly match and you are experiencing problems on the database side (long response times, high latency), either scale up or scale out after you have made sure that it is not your application slowing the responses down.
Found the answer:
In 3.2, WiredTiger is the default storage engine, so I had to remove the existing databases (/var/lib/mongod). After starting mongod, it replicated from the primary member.
I had accidentally updated directly to 3.2.9, and it worked. That was the reason WiredTiger was the default engine.