MongoDB MMS – Replica Set Fails to Restart

mongo-repair mongodb

I have an MMS replica set deployed with 3 instances. It was working fine until I shut it down this morning to make some changes and then restarted it. From the logs, it looks like the replica set was shut down uncleanly. The mongod server on the primary now fails to start.

Here are the full logs:

2015-08-20T07:08:41.389+0000 W -        [initandlisten] Detected unclean shutdown - /data/XXXXXXXXXXXX/mongod.lock is not empty.
2015-08-20T07:08:41.406+0000 I JOURNAL  [initandlisten] journal dir=/data/XXXXXXXXXXXXX/journal
2015-08-20T07:08:41.406+0000 I JOURNAL  [initandlisten] recover begin
2015-08-20T07:08:41.406+0000 I JOURNAL  [initandlisten] info no lsn file in journal/ directory
2015-08-20T07:08:41.406+0000 I JOURNAL  [initandlisten] recover lsn: 0
2015-08-20T07:08:41.406+0000 I JOURNAL  [initandlisten] recover /data/XXXXXXXXXXXX/journal/j._0
2015-08-20T07:08:41.407+0000 I JOURNAL  [initandlisten] recover cleaning up
2015-08-20T07:08:41.407+0000 I JOURNAL  [initandlisten] removeJournalFiles
2015-08-20T07:08:41.641+0000 I JOURNAL  [initandlisten] recover done
2015-08-20T07:08:41.641+0000 I JOURNAL  [initandlisten] preallocating a journal file /data/XXXXXXXXXXX/journal/prealloc.0
2015-08-20T07:08:44.074+0000 I -        [initandlisten]   File Preallocator Progress: 744488960/1073741824 69%
2015-08-20T07:08:47.176+0000 I -        [initandlisten]   File Preallocator Progress: 901775360/1073741824 83%
2015-08-20T07:08:50.274+0000 I -        [initandlisten]   File Preallocator Progress: 1027604480/1073741824 95%
2015-08-20T07:09:09.057+0000 I JOURNAL  [durability] Durability thread started
2015-08-20T07:09:09.057+0000 I JOURNAL  [journal writer] Journal writer thread started
2015-08-20T07:09:09.060+0000 I CONTROL  [initandlisten] MongoDB starting : pid=25507 port=27000 dbpath=/data/XXXXXXXXXXXXX 64-bit host=CH$
2015-08-20T07:09:09.060+0000 I CONTROL  [initandlisten] db version v3.0.2
2015-08-20T07:09:09.060+0000 I CONTROL  [initandlisten] git version: 6201872043ecbbc0a4cc169b5482dcf385fc464f
2015-08-20T07:09:09.060+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
2015-08-20T07:09:09.060+0000 I CONTROL  [initandlisten] build info: Linux ip-10-229-1-2 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 U$
2015-08-20T07:09:09.060+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2015-08-20T07:09:09.060+0000 I CONTROL  [initandlisten] options: { config: "/data/XXXXXXXXXXX/automation-mongod.conf", net: { port: 270$
2015-08-20T07:09:09.074+0000 I -        [initandlisten] Invariant failure _name == nsToDatabaseSubstring( ns ) src/mongo/db/catalog/database.c$
2015-08-20T07:09:09.091+0000 I CONTROL  [initandlisten]
 0xf4f859 0xef0031 0xed4b52 0x91e106 0x91e18f 0x920033 0x922cb0 0x808701 0x7d4ba4 0x7f0503489ec5 0x805d17
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B4F859"},{"b":"400000","o":"AF0031"},{"b":"400000","o":"AD4B52"},{"b":"400000","o":"51E106"},{"b":"400000","o$
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf4f859]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef0031]
 mongod(_ZN5mongo15invariantFailedEPKcS1_j+0xB2) [0xed4b52]
 mongod(_ZNK5mongo8Database13getCollectionERKNS_10StringDataE+0x36) [0x91e106]
 mongod(_ZN5mongo8Database30_getOrCreateCollectionInstanceEPNS_16OperationContextERKNS_10StringDataE+0x1F) [0x91e18f]
 mongod(_ZN5mongo8DatabaseC1EPNS_16OperationContextERKNS_10StringDataEPNS_20DatabaseCatalogEntryE+0x1E3) [0x920033]
 mongod(_ZN5mongo14DatabaseHolder6openDbEPNS_16OperationContextERKNS_10StringDataEPb+0x150) [0x922cb0]
 mongod(_ZN5mongo13initAndListenEi+0xC01) [0x808701]
 mongod(main+0x134) [0x7d4ba4]
 libc.so.6(__libc_start_main+0xF5) [0x7f0503489ec5]
 mongod(+0x405D17) [0x805d17]
-----  END BACKTRACE  -----
2015-08-20T07:09:09.091+0000 I -        [initandlisten]

***aborting after invariant() failure

Any idea how I can fix it? I have been trying to fix it for the past 4 hours, but nothing seems to work.

Best Answer

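So, it turned out that the issue was in the .ns file of one of the databases. This fits the invariant failure in the log: _name == nsToDatabaseSubstring( ns ) fails when a namespace loaded while opening a database does not start with that database's name. I had to delete that file and restart the server. The server started successfully; however, the database whose .ns file I deleted was lost.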
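
Deleting the file outright cannot be undone, so a slightly safer variant is to move it out of the data directory first. Below is a minimal Python sketch of that idea. It assumes mongod is stopped; the function name and all three arguments are hypothetical placeholders, not values from the logs above.

```python
import shutil
from pathlib import Path


def move_ns_aside(dbpath, dbname, backup_dir):
    """Move <dbname>.ns out of dbpath into backup_dir and return the new path.

    Assumes mongod is NOT running against dbpath. The arguments are
    placeholders; substitute your own data directory and database name.
    """
    src = Path(dbpath) / f"{dbname}.ns"
    if not src.is_file():
        raise FileNotFoundError(f"no namespace file at {src}")

    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    dest = dest_dir / src.name
    if dest.exists():
        # Never clobber an earlier backup of the same file.
        raise FileExistsError(f"refusing to overwrite {dest}")

    shutil.move(str(src), str(dest))
    return dest
```

A call would look like move_ns_aside("/data/mydbpath", "suspect_db", "/data/ns-backup"). Moving the file does not bring the database back; it only keeps the file around for later inspection.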

For newcomers to MongoDB: the .ns file is the namespace file that mongod creates for each database in its data (/data) directory.
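
To see which databases have a namespace file, the data directory can be scanned for *.ns. Here is a small Python sketch, assuming the files sit directly in the data directory (if the server runs with the directoryPerDB option they sit one level down, which this sketch does not handle). The function name is made up for illustration.

```python
from pathlib import Path


def list_ns_files(dbpath):
    """Return {database_name: size_in_bytes} for every *.ns file in dbpath."""
    return {
        p.stem: p.stat().st_size  # "mydb.ns" -> "mydb"
        for p in sorted(Path(dbpath).glob("*.ns"))
        if p.is_file()
    }
```

For example, list_ns_files("/data/mydbpath") (a placeholder path) returns one entry per database, which helps narrow down which file to look at.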