I can't seem to disable the write concern while running mongorestore with a .bson file, and it severely hurts the insertion rate (inserts/sec).
This is my sharding status.
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("546bd498f0531e27d40544b2")
}
shards:
{ "_id" : "shard2", "host" : "shard2/hadoop5.xx.com:27018" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "deneme6", "partitioned" : false, "primary" : "shard2" }
{ "_id" : "seref", "partitioned" : true, "primary" : "shard2" }
This is my shard. I have already set getLastErrorDefaults to { w: 0 } to disable the write concern.
shard2:PRIMARY> show tables;
me
oplog.rs
startup_log
system.indexes
system.replset
shard2:PRIMARY> db.me.find()
{ "_id" : ObjectId("546bd2869e14c01f13708886"), "host" : "hadoop5.xx.com" }
shard2:PRIMARY> db.system.replset.find()
{ "_id" : "shard2", "version" : 2, "members" : [ { "_id" : 0, "host" : "hadoop5.xx.com:27018" } ], "settings" : { "getLastErrorDefaults" : { "w" : 0 } } }
shard2:PRIMARY>
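For reference, I applied the getLastErrorDefaults setting shown above roughly like this (mongo shell sketch against the replica set primary; rs.reconfig() is the standard way to change it):

```javascript
// mongo shell, connected to the shard2 primary
cfg = rs.conf()
cfg.settings = { getLastErrorDefaults : { w : 0 } }
rs.reconfig(cfg)
```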
This is my mongorestore command, which loads the data through the query router (mongos):
mongorestore --db seref --collection can11 --noobjcheck --w 0 --noIndexRestore --host 10.21.12.21 --port 27017 /tmp/iabc_kwd.bson
When I run mongorestore against the sharded cluster via mongos, the insertion rate is about 300 KB/sec; when I restore into a single mongod, it is about 12 MB/sec.
How can I speed up the insertion rate when restoring into a sharded MongoDB cluster?
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 3,
"minCompatibleVersion" : 3,
"currentVersion" : 4,
"clusterId" : ObjectId("546ca29960e36f8983130015")
}
shards:
{ "_id" : "shard0000", "host" : "hadoop5:27018" }
{ "_id" : "shard0001", "host" : "hadoop6:27017" }
{ "_id" : "shard0002", "host" : "hadoop4:27017" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
{ "_id" : "seref", "partitioned" : true, "primary" : "shard0000" }
{ "_id" : "abc", "partitioned" : false, "primary" : "shard0001" }
{ "_id" : "mahmut", "partitioned" : true, "primary" : "shard0002" }
{ "_id" : "incee", "partitioned" : true, "primary" : "shard0002" }
incee.onur
shard key: { "_id" : "hashed" }
chunks:
shard0000 1
shard0002 501
too many chunks to print, use verbose if you want to force print
All shard machines have 8 GB of RAM and a 1.3 TB HDD.
Best Answer
The problem (from what I can see)
From what I can see in your output, you have a monotonically increasing shard key, like ObjectId or a Date or something similar.
MongoDB sharding is done with key ranges over the selected shard key. Put simply, it works like this: shard0000 is supposed to hold the range from -infinity to $i, shard0001 the keys from $h to $p, and shard0002 from $q to infinity. With a monotonically increasing shard key, every new value is bigger than the last one inserted, heading towards infinity, so the shard that receives the write is always shard0002, instead of the writes being balanced across all shards.
So all of the write operations seem to hit shard0002, with individual chunks then being migrated to the other shards. Between the necessary chunk migrations, the added network latency, and the large number of metadata updates going on, it may well be that this is what causes your poor write performance.
The solution
Really bad news: you can't change the shard key once it is set. You basically have to create a new sharded collection with a better shard key and re-import the data. For choosing a proper shard key, please read Considerations for Selecting Shard Keys in the MongoDB docs.
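As a sketch of what that could look like — assuming a hashed _id fits your access pattern (the collection name can11_new is made up for illustration):

```javascript
// mongo shell, connected to mongos -- a sketch, not a drop-in script
sh.enableSharding("seref")

// A hashed _id spreads monotonically increasing ObjectIds evenly across
// the shards, and numInitialChunks pre-splits the collection so the
// restore does not start with every chunk on a single shard.
db.adminCommand({
    shardCollection: "seref.can11_new",
    key: { _id: "hashed" },
    numInitialChunks: 6
})
```

After that, running mongorestore into seref.can11_new via mongos should distribute the inserts across all shards instead of funneling them through one.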
P.S. I wasn't sure whether to answer this question, since it is slightly off topic on Stack Overflow, which is aimed at programming questions. But since the developers are usually the ones who choose the shard key, my intention was to clarify things for them. Additionally, the data modeling might be influenced by sharding considerations, so – by a small margin – I thought it was OK to answer here. However, dear reviewer, feel free to flag this answer for deletion if you think differently. I am really not sure.