MongoDB – mongod crashing due to too many open files

mongodb, mongodb-3.4, Ubuntu

I recently upgraded an Ubuntu 14 server, running mongo 3.4.11, to Ubuntu 16. I re-installed the identical version of mongodb-org from their PPA, but now when I start mongo, it doesn't respond to any connections, and I see this error in /var/log/mongodb/mongodb.log:

2018-02-06T18:40:15.680+0000 I CONTROL  [main] ***** SERVER RESTARTED *****
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] MongoDB starting : pid=15925 port=27017 dbpath=/var/lib/mongodb 64-bit host=proddb1
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] db version v3.4.11
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] git version: 34f5bec2c9d827d71828fe858167f89a28b29a2a
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] modules: none
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] build environment:
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten]     distmod: ubuntu1604
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten]     distarch: x86_64
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2018-02-06T18:40:15.684+0000 I CONTROL  [initandlisten] options: { config: "/etc/mongodb.conf", net: { bindIp: "127.0.0.1" }, storage: { dbPath: "/var/lib/mongodb", engine: "wiredTiger", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongodb.log" } }
2018-02-06T18:40:15.684+0000 W -        [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.
2018-02-06T18:40:15.709+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2018-02-06T18:40:15.709+0000 I STORAGE  [initandlisten] 
2018-02-06T18:40:15.709+0000 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2018-02-06T18:40:15.709+0000 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2018-02-06T18:40:15.709+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=15574M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),verbose=(recovery_progress),
2018-02-06T18:40:16.468+0000 I STORAGE  [initandlisten] WiredTiger message [1517942416:468088][15925:0x7f4d31632d00], txn-recover: Main recovery loop: starting at 12168/128
2018-02-06T18:40:16.468+0000 I STORAGE  [initandlisten] WiredTiger message [1517942416:468689][15925:0x7f4d31632d00], txn-recover: Recovering log 12168 through 12169
2018-02-06T18:40:16.528+0000 I STORAGE  [initandlisten] WiredTiger message [1517942416:528271][15925:0x7f4d31632d00], txn-recover: Recovering log 12169 through 12169
2018-02-06T18:40:17.875+0000 E STORAGE  [initandlisten] WiredTiger error (24) [1517942417:875171][15925:0x7f4d31632d00], file:collection-43442-4253276309270751377.wt, WT_SESSION.open_cursor: /var/lib/mongodb/collection-43442-4253276309270751377.wt: handle-open: open: Too many open files
2018-02-06T18:40:17.875+0000 I -        [initandlisten] Invariant failure: ret resulted in status UnknownError: 24: Too many open files at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 95
2018-02-06T18:40:17.875+0000 I -        [initandlisten] 

***aborting after invariant() failure

I've seen a few similar questions, like this one and this one, but none of the solutions had any effect for me.

I tried creating /lib/systemd/system/mongodb.service (with sudo nano) and adding:

[Service]
# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (open files)
LimitNOFILE=990000
# (processes/threads)
LimitNPROC=495000

but after rebooting, I received the same error.
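
For what it's worth, here is one way to check whether the limit actually reached the process (a rough check; mongod has to be alive, if only during its brief startup/recovery window, for the /proc entry to exist):

# Inspect the limits of the live mongod process; "Max open files" should
# report the value from the unit file if systemd applied it
grep 'Max open files' /proc/$(pidof mongod)/limits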

How do I fix this?

The really odd thing is that I've been developing against a mirror of this database on a separate Ubuntu 16 server with mongo 3.4.10… and I never ran into this issue. My /etc/mongod.conf is even the same on both servers, yet I've never had this error on my dev machine. I'm still somewhat new to mongo administration. Is it common to have such major incompatibilities between minor mongo releases like this?

Best Answer

I have gone through your error, and in your case the open-files limit is 990000, which is more than sufficient for the mongos and mongod processes. Per the MongoDB documentation, net.maxIncomingConnections defaults to 65536, and the recommended systemd setting is LimitNOFILE=64000.

The maximum number of simultaneous connections that mongos or mongod will accept. This setting has no effect if it is higher than your operating system’s configured maximum connection tracking threshold.

Do not assign too low of a value to this option, or you will encounter errors during normal application operation.

This is particularly useful for a mongos if you have a client that creates multiple connections and allows them to timeout rather than closing them.

In this case, set maxIncomingConnections to a value slightly higher than the maximum number of connections that the client creates, or the maximum size of the connection pool.
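
For example, the cap can be set in the YAML config file (a minimal sketch against the /etc/mongodb.conf shown in the log above, assuming it uses the YAML format; 65536 is simply the documented default, not a tuned value):

net:
  bindIp: 127.0.0.1
  maxIncomingConnections: 65536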

As Ramon Fernandez explains on the MongoDB JIRA, WiredTiger needs at least two files per collection (one for the collection data and one for the _id index), plus one file per additional index on a collection. If your total count of collections and indexes is large, you'll need to adjust your open-files limit accordingly.
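
A quick way to estimate how many handles your data set needs is to count the WiredTiger files in the dbpath, since every collection and every index lives in its own .wt file (a sketch using the dbpath from the log above):

# Each collection and each index is a separate .wt file, so this count
# approximates the minimum open-files headroom mongod needs for data alone
ls /var/lib/mongodb/*.wt | wc -l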

The MongoDB-recommended ulimit settings for Linux distributions that use systemd are:

[Service]
# Other directives omitted
# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (open files)
LimitNOFILE=64000
# (processes/threads)
LimitNPROC=64000

In your case it's 990000, which should be sufficient for most deployments, so the question is whether that value is actually reaching the mongod process. Make sure you also set a high value for this limit at the system level.
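
Also note that systemd only reads unit files when its definitions are (re)loaded, so after editing the service file you need to reload and restart; a minimal sketch (the unit name mongodb matches the file created in the question, though many packages install it as mongod):

# Reload unit definitions so the edited service file takes effect,
# then restart and confirm the limit systemd will apply
sudo systemctl daemon-reload
sudo systemctl restart mongodb
systemctl show mongodb -p LimitNOFILE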

For further reference, see the MongoDB documentation on UNIX ulimit settings and the systemd documentation table of resource-limit directives, their equivalent ulimit shell commands, and the unit each uses.
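
For convenience, these are the ulimit shell equivalents of the directives above (they affect only the current shell, so they serve as a cross-check rather than a substitute for the systemd settings):

ulimit -f unlimited   # LimitFSIZE  (file size)
ulimit -t unlimited   # LimitCPU    (cpu time)
ulimit -v unlimited   # LimitAS     (virtual memory size)
ulimit -n 64000       # LimitNOFILE (open files)
ulimit -u 64000       # LimitNPROC  (processes/threads)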