I've decided to shard a collection with 500M+ documents into 24 shards. All shards are on the same machine without replicas, as was advised in this title (core sharding). The shard key is {"_id": 1}. I found that chunk migration is extremely slow: it has been running for 2 days and progress is about 30%. I tried to move chunks manually: I stopped the balancer and executed commands from this title, but it seems to me that nothing happened. I started the balancer again and it continued to migrate chunks at the same rate. Is there any way to make it faster?
Mongodb: slow shard balancing
Tags: load-balancing, mongodb, sharding
Related Solutions
Building on what you found out yourself, I can provide a detailed explanation.
How chunk migration works (simplified)
Here is how chunk migration works, simplified for the sake of brevity:
- When a chunk exceeds the configured chunkSize (64 MB by default), the mongos that caused the size increase splits the chunk.
- The chunk is split on the shard the original chunk resided on. Since the config servers do not form a replica set, the metadata update has to be applied to all three of them.
- If the difference in the number of chunks between shards reaches the chunk migration threshold, the cluster balancer initiates the migration process.
- The balancer tries to acquire the balancing lock. Note that because of this, there can be only one chunk migration at any given point in time. (This will become important in the explanation of what happened.)
- In order to acquire the balancing lock, the balancer needs to update the config on all three config servers.
- Once the balancing lock is acquired, the data is copied over to the shard with the fewest chunks.
- After the data is successfully copied to the destination, the balancer updates the key-range-to-shard mapping on all three config servers.
- As a last step, the balancer notifies the source shard to delete the migrated chunk.
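The threshold check and source/destination selection described above can be sketched in JavaScript. This is a simplified illustration, not MongoDB's actual implementation; the threshold values follow the older documentation for this era (2 for fewer than 20 chunks, 4 for 20–79, 8 for 80 or more), and the shard names and chunk counts are hypothetical:

```javascript
// Sketch of the balancer's decision logic (simplified).
// Thresholds are an assumption based on older MongoDB docs, not taken from the source.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) return 2;
  if (totalChunks < 80) return 4;
  return 8;
}

// chunkCounts: map of shard name -> number of chunks on that shard
function planMigration(chunkCounts) {
  const entries = Object.entries(chunkCounts);
  const total = entries.reduce((sum, [, n]) => sum + n, 0);
  const source = entries.reduce((a, b) => (b[1] > a[1] ? b : a)); // most chunks
  const dest   = entries.reduce((a, b) => (b[1] < a[1] ? b : a)); // fewest chunks
  if (source[1] - dest[1] < migrationThreshold(total)) {
    return null; // cluster considered balanced; no migration needed
  }
  // Only one migration runs at a time: one chunk moves per balancing round.
  return { from: source[0], to: dest[0] };
}

console.log(planMigration({ rs01: 30, rs02: 20, rs03: 10 }));
// → { from: 'rs01', to: 'rs03' }
```

Note how a large imbalance still drains one chunk per round, which is why a backlog of migrations takes a long time to work off.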
What happened in your case
Your data grew to a point where chunks had to be split. At that point, all three of your config servers were available and the metadata was updated accordingly. Splitting a chunk is relatively cheap compared to migrating one and may happen often, so under load you usually see far more chunk splits than migrations. As said before, only one chunk migration can happen at any given point in time.
For whatever reason, one or more of your config servers became unreachable after the chunks were split, but before enough chunks had been migrated to bring the imbalance below the migration threshold (remember, migrations run strictly one at a time, so working off the backlog takes a while). Bottom line: before every chunk that needed to be migrated was actually migrated, one or more of your config servers became unavailable.
Now the balancer wanted to migrate a chunk, but could not reach all config servers to acquire the global migration lock. Hence the error message.
How to deal with such situation
It is very likely that your config servers were out of sync. In order to deal with this situation, please read Adam Comerford's answer to "mongodb config servers not in sync" and follow it to the letter.
How to prevent
Plain and simple: use MMS. It is free, gives you a lot of information about the health and performance of your cluster, and with the automation agent the administration of a MongoDB cluster can be done a lot faster.
I tend to suggest installing at least 3 monitoring agents, so you can have scheduled downtime on one while maintaining redundancy with the others.
MMS has alerting capabilities, so you will be notified if one of your config servers becomes unavailable, which is a serious situation.
The definition of the steps referenced in a chunk migration is a little fluid and can depend on the version you are running. However, based on what you have provided, I suspect that the primary on rs01 and the number of chunks you are trying to move are the source of the issue here.
Last time I looked at this in detail (around version 2.6), step 2 was basically a "sanity check" on the source shard primary to make sure it was ready to kick off another migration. There are two common reasons (and some not so common) for the primary to get "stuck" in step 2:
- The primary is "too busy" - i.e. it is processing a lot of deletes from previous migrations and does not want to take on more migrations until some of those finish
- There is an error being hit (chunk too big for example) and the migration is being aborted
Given that your two existing shards were not yet balanced when you added a third shard, adding rs03 will not make this happen any more quickly, even though rs03 will now be the preferred destination for migrated chunks since it has the lowest total. In fact, it just means there are more chunks to move than before you added the third shard.
To confirm what the root cause is here, you will need to take a look at the logs on the primary of shard rs01. The shard with the most chunks is always chosen as the source of a migration, and its primary does all the work. The logs should tell you why the migrations are being aborted or taking a long time.
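Besides the mongod log itself, the config database keeps a changelog of migration events that can point at aborted migrations. A sketch of what to run from a mongo shell connected to a mongos (the config.changelog collection and the moveChunk event names exist in this era of MongoDB; the exact output varies by version):

```javascript
// Run from a mongo shell connected to a mongos.
// Show the five most recent migration-related changelog entries;
// look for "moveChunk.error" or abort messages in the details.
db.getSiblingDB("config").changelog
  .find({ what: /moveChunk/ })
  .sort({ time: -1 })
  .limit(5)
  .pretty()
```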
If it is a case of being too busy, you do have an option that will free it up, temporarily at least (it may get stuck again later). You can step down the primary, thereby killing any background delete jobs it is running for cleanup.
However, this will mean leaving behind "orphaned chunks", and you will need to run a cleanup command later to fix that - you are not solving anything per se, you are just putting the work off until later. There are also only limited options to tweak this behavior; for more, see here.
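As a sketch, the two admin operations involved look like this in the mongo shell. Both command names are real in 2.6-era MongoDB, but treat this as an outline to be used with care on a live cluster, not a recipe; the namespace "mydb.mycoll" is hypothetical:

```javascript
// On the PRIMARY of the busy shard (e.g. rs01): step down.
// This also kills the background range-deleter jobs it is running.
rs.stepDown(60)  // seconds during which the old primary refuses re-election

// Later, on each shard primary: remove orphaned documents left behind.
db.adminCommand({ cleanupOrphaned: "mydb.mycoll" })
```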
There are other options in terms of balancing chunks more quickly (mitosis), but they are way beyond the scope of an answer on this site I'm afraid.
If there is an error being hit, and the cleanup is not the issue, then it would probably be best to get the primary log messages and post a separate question to figure out the cause.
Best Answer
Math first: you moved 150M documents in 2 days, which is roughly 860 documents per second, including metadata and indices, with all reads and writes happening on the same machine. That is not what I would call slow; the description coming to my mind is "lightning fast". ;)
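For transparency, here is the arithmetic behind that figure as a quick check (30% of 500M taken as 150M documents):

```javascript
// Back-of-the-envelope rate check for the numbers quoted above.
const documentsMoved = 0.30 * 500e6;   // ~150M documents migrated so far
const elapsedSeconds = 2 * 24 * 3600;  // two days
const rate = documentsMoved / elapsedSeconds;
console.log(Math.round(rate));         // → 868 documents per second
```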
Since there is no real distribution of the write load, an easy way to speed things up is to add two or more machines.
A few notes: sharding production data onto shards that are not replica sets is dangerous, to say the least. If one of the shards fails, the data it holds is unavailable until you get the shard up and running again. Furthermore, while there is no server to write to for that shard's key ranges, values in those ranges cannot be written at all. If the shard were a replica set, a failure would simply lead to the election of a new primary, to which all writes and (depending on the configuration) most or even all reads would go.
_id can be used as a shard key; however, it should be hashed: a monotonically increasing _id (such as the default ObjectId) sends every insert to the same chunk, creating a hot shard.
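As a sketch, sharding on a hashed _id looks like this in the mongo shell (both commands are real; the database and collection names "mydb.mycoll" are hypothetical, and this must be done before the collection holds data you cannot re-shard):

```javascript
// Run from a mongo shell connected to a mongos.
sh.enableSharding("mydb")
sh.shardCollection("mydb.mycoll", { _id: "hashed" })
```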