It's not a matter of the number of operations you will be doing, but rather the number of connections and (possibly, though it would be unusual for this to play a big part) the number of server-side JavaScript operations you plan to run (mainly Map/Reduce).
To explain: there will be a thread (and a file descriptor) for each connection made to/from the `mongod` process (and similarly for `mongos`) - therefore it is generally a good idea to have both values set beyond the hard-coded 20,000 connection limit in MongoDB. You can see this if you run `htop`, or a command like the following, while you spin up new connections to the `mongod` or `mongos` processes:

    ps uH p <PID_OF_YOUR_PROCESS> | wc -l
Most users will never get anywhere near these maximums, so this is merely a precaution on most systems to avoid problems with low ulimits. In a large cluster with many `mongos` processes you may see levels approaching the limit, but unless you are planning a deployment of that scale you will not have to worry.
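To see where you stand, you can check the relevant per-process limits for the user that runs `mongod`/`mongos`. This is a minimal sketch using the standard `ulimit` shell builtin; `-n` covers open file descriptors and `-u` covers processes/threads:

```shell
# Soft limit on open file descriptors (one is consumed per connection)
ulimit -n

# Soft limit on user processes; each connection's thread counts against this
ulimit -u
```

On most Linux systems these can be raised persistently in `/etc/security/limits.conf` (or via systemd unit settings), though the exact mechanism varies by distribution.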
For more information on the Map Reduce side of things, there is an excellent article on it, which includes thread use here.
This is a common misconception, i.e. that yields are somehow causing the slowness. In fact they are a symptom, not a cause. Even if there is no lock contention that requires a yield (writes basically), the queries still yield when they have to page from disk. They then re-acquire the lock when a certain amount of paging is done and look to yield again if more paging is needed (repeat until complete). If there is no lock contention from writes, then this is all pretty much instantaneous and does not add to the overall execution time.
If a query yields a lot, then it was hitting disk a lot, and that disk access is the cause of the slowness. Hence, `numYields` is just a way to infer that it was indeed paging from disk that caused the query to be slow. If you want those queries to be fast, then you need to have that data set in memory, and have enough memory for it to stay there long term and not be evicted.
Note: by default the kernel uses an LRU (least recently used) policy to decide what gets evicted, so the likely candidates for slowness are queries on (large) parts of your data set that are not accessed very often.
There is no way to limit `numYields`, and it wouldn't really make sense to do so, but yes, the remedy is to identify the data being addressed by those slow queries and make it fit into memory (note: the first query on any data will still be slow unless you pre-heat the cache in some way; the second query will hit memory and be fast).
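One low-tech way to identify those queries is to look for high `numYields` values in the slow-query lines of the `mongod` log. The log line below is a made-up illustration (the exact format varies by MongoDB version); in practice you would grep your actual log file:

```shell
# Illustrative slow-query log line (hypothetical example; real format varies
# by MongoDB version) - extract the numYields counter from it with grep.
logline='Mon Jul  1 12:00:00 [conn42] query test.coll ... numYields:87 nreturned:10 2300ms'
echo "$logline" | grep -oE 'numYields:[0-9]+'
```

Sorting such lines by the extracted value is a quick way to find which queries are paging the most.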
The default stack size for MongoDB threads is already 1024KB, not 8192KB (it is set in the code, not as a system setting) and has been since version 1.8.3 (see SERVER-2707), so you are already seeing the benefits of a lower stack size.
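For comparison, you can check the system's default stack size limit (the value a thread would inherit if MongoDB did not set its own) with `ulimit`; on many Linux distributions this reports 8192 (KB):

```shell
# Soft limit on stack size, reported in kilobytes (or "unlimited").
# MongoDB overrides this per-thread in its own code, so this only shows
# the inherited system default, not what mongod threads actually use.
ulimit -s
```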