MongoDB – Heavy load on one of the shards in MongoDB

mongodb · performance · sharding

I have a problem with one of our shards. For some reason I'm seeing the following load:

15:38:42 up 4 days, 10:28, 1 user, load average: 3.67, 3.50, 3.56

Meanwhile, all the other shards are fine. Their load average is approximately:

16:49:58 up 11 days, 56 min, 1 user, load average: 0.03, 0.03, 0.05

I have 3 config servers, 2 query routers (mongos), and 7 shards (each of them a single-member replica set; don't ask why).

I have 2 databases: 20 GB and 30 GB. All servers have the same configuration, and each database contains only one collection. In the MMS control panel I see a high:

Write Lock percentage

Could the problem be here?
How can I investigate this issue and fix it?

As I'm new to MongoDB, I used the following Stack Overflow question to configure my shards:

https://stackoverflow.com/questions/19672302/how-to-programatically-pre-split-a-guid-based-shard-key-with-mongodb

Please see the accepted answer there. Alternatively, how can I check my shard key myself?

Best Answer

The problem explained

As per your comment, your shard key is the _id field of the document. This field is monotonically increasing, basically like an incremented integer.

Put simply, sharding works this way: documents are stored in chunks. Those chunks are spread over the cluster based on ranges of the shard key. Let's look at a simple example:

  • s1: chunks holding id 0-50
  • s2: chunks holding id 51-100
  • s3: chunks holding id 101-?

And here is where the problem occurs: there is one shard where all new documents go. What happens then is that new chunks are created on it all the time (as chunks have a fixed maximum size). So not only are all writes focused on one server, that server also has to do the extra splitting work. After a certain threshold is met, the cluster starts to move chunks to the other shards, though this only balances out disk usage. In our example, it might then look like this:

  • s1: chunks holding id 0-100
  • s2: chunks holding id 101-200
  • s3: chunks holding id 201-?

Obviously, all write operations will still go to s3.
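The routing behaviour above can be sketched in a few lines of Python (an illustration of range-based chunk routing, not MongoDB's actual code): with contiguous key ranges and a monotonically increasing shard key, every new document falls into the open-ended top range.

```python
import bisect

# Hypothetical chunk layout after balancing, as in the example above:
# s1 owns keys up to 100, s2 owns 101-200, s3 owns 201 and above.
split_points = [101, 201]
shards = ["s1", "s2", "s3"]

def route(shard_key):
    """Pick the shard whose range contains the key (simplified mongos routing)."""
    return shards[bisect.bisect_right(split_points, shard_key)]

# Monotonically increasing keys: every new insert lands on the last shard.
targets = [route(key) for key in range(201, 251)]
print(set(targets))  # -> {'s3'}
```

However many chunks the balancer moves away, the top open-ended range stays on one shard, so that shard keeps absorbing every insert.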

What went wrong?

You chose the wrong shard key. Monotonically increasing shard keys lead to the problem explained above. Although ObjectIds look like hash sums, they aren't: they are monotonically increasing, because their leading bytes are a timestamp.
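You can see this directly: per the BSON specification, the first 4 bytes of an ObjectId are a big-endian Unix timestamp in seconds. A small Python sketch (building only that timestamp prefix, not a full ObjectId) shows that ids generated later always sort higher:

```python
import struct

def objectid_prefix(unix_ts):
    """First 4 bytes of an ObjectId: big-endian seconds since the epoch.
    (The remaining 8 bytes - a random value plus a counter - are omitted here.)"""
    return struct.pack(">I", unix_ts).hex()

# Prefixes for documents created one second apart:
prefixes = [objectid_prefix(ts) for ts in (1700000000, 1700000001, 1700000002)]
print(prefixes)
print(prefixes == sorted(prefixes))  # -> True
```

Since the timestamp dominates the ordering, every freshly inserted `_id` is greater than all existing ones, which is exactly the monotonic pattern that creates the hot shard.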

What can be done?

You need a better shard key. Since a shard key cannot be changed after a collection is sharded, migrating to a new shard key is not a simple thing to do. However, there is a quite detailed explanation in MongoDB's ticket system. Basically, it works like this:

  1. Remove all but one shard from the cluster
  2. Wait for the chunk migrations to be finished
  3. Shut down the remaining shard
  4. Change the shard key through a mongos instance
  5. Restart the last shard
  6. Add the other shards to the cluster
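When you re-shard, a common fix for monotonically increasing ids is a hashed shard key (e.g. `sh.shardCollection("mydb.mycoll", { _id: "hashed" })` on an empty collection). The effect can be illustrated in Python; note that MD5 here merely stands in for MongoDB's internal 64-bit hash, and the three-way bucketing is a simplification of chunk ranges over hashed values:

```python
import hashlib
from collections import Counter

def hashed_bucket(shard_key, num_shards=3):
    """Hash the key and bucket it - roughly the effect of a hashed shard key.
    (MongoDB uses its own 64-bit hash function, not MD5; this is illustrative.)"""
    digest = hashlib.md5(str(shard_key).encode()).hexdigest()
    return "s%d" % (int(digest, 16) % num_shards + 1)

# The same monotonically increasing keys now spread across all shards:
counts = Counter(hashed_bucket(k) for k in range(201, 251))
print(counts)  # all three shards receive a share of the writes
```

Hashing destroys the insertion order, so consecutive ids land on different shards and the write load is spread out, at the cost of making range queries on the shard key scatter across the cluster.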