I've had a project dumped on me and I'm trying to get up to speed withMongo
. But I have an immediate firefight, in that one shard in my Mongo cluster just keeps growing and growing, adding more files daily, and I can't find the cause. Our cluster is simple, 12 sharded
replica sets. "rs1"
is the primary shard, and it's the one that keeps gobbling up more and more disk space while the other 11 shards haven't grown at all. Our data is simple too, we just create 3 collections per day of our 3 document types. And we keep those for 90 days. So roughly at any time we have about 270 or so active collections, and nightly we drop the oldest collections as new collections are created for upcoming days.
The balancer
seems to be running fine, I can see the chunk counts getting balanced in the collections normally.
Is there some way I can STOP mongo from sending any more data to my problem shard while I get a handle on what's going on? I've already tried setting a maxSize, but that hasn't prevented it from still growing. Right now rs1 is taking up about 885 2gb disk files. (note that this is Mongo 2.6)
shards:
{ "_id" : "rs1", "host" : "rs1/psc1c250:39080,psc2c250:39080", "maxSize" : 1500000 }
{ "_id" : "rs10", "host" : "rs10/psc1c259:39080,psc2c259:39080" }
{ "_id" : "rs11", "host" : "rs11/psc1c260:39080,psc2c260:39080" }
{ "_id" : "rs12", "host" : "rs12/psc1c261:39080,psc2c261:39080" }
{ "_id" : "rs2", "host" : "rs2/psc1c251:39080,psc2c251:39080" }
{ "_id" : "rs3", "host" : "rs3/psc1c252:39080,psc2c252:39080" }
{ "_id" : "rs4", "host" : "rs4/psc1c253:39080,psc2c253:39080" }
{ "_id" : "rs5", "host" : "rs5/psc1c254:39080,psc2c254:39080" }
{ "_id" : "rs6", "host" : "rs6/psc1c255:39080,psc2c255:39080" }
{ "_id" : "rs7", "host" : "rs7/psc1c256:39080,psc2c256:39080" }
{ "_id" : "rs8", "host" : "rs8/psc1c257:39080,psc2c257:39080" }
{ "_id" : "rs9", "host" : "rs9/psc1c258:39080,psc2c258:39080" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "prodLN", "partitioned" : true, "primary" : "rs1" }
rs1:PRIMARY> db.stats(1024*1024*1024)
{
"db" : "prodLN",
"collections" : 319,
"objects" : 301960471,
"avgObjSize" : 3532.3332748146363,
"dataSize" : 993,
"storageSize" : 1626,
"numExtents" : 4436,
"indexes" : 634,
"indexSize" : 25,
"fileSize" : 1765,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 1820,
"totalSize" : 81
},
"ok" : 1
}
Best Answer
If you have one database with hundreds of collections (it's not clear that is the case because you have not included an
sh.status()
or similar but that seems to be what you are implying), then here is the likely scenario:prodLN
database hasrs1
as its primary shardrs1
What can you do? Well, you have a few options, but they require changes and/or downtime for the use of the current database:
movePrimary
on the current database, but you have to stop writing to any unsharded collections (read the behavior section in the link above). Depending on how much data there is in unsharded collections this could take some time.