I have a sharded collection whose shard key is a field called "uuid". The field's value is a string of hexadecimal digits, i.e. a hexadecimal string, and it is unique for each document.
The data is divided into chunks automatically by MongoDB.
I cannot figure out how MongoDB divides these hexadecimal strings into contiguous ranges, and I could not find any documentation that explains how Mongo forms them.
Can you please help me to understand how these ranges are formed?
As a sample, I inserted 3025357 documents with such hexadecimal values. The chunks and their associated ranges are:
{
"_id" : "database.sha_shard-uuid_MinKey",
"lastmod" : Timestamp(2, 0),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : { "$minKey" : 1 }
},
"max" : {
"uuid" : "000043c071f23fc889275f77f950c649faac92e0"
},
"shard" : "shardRpSet2",
"history" : [
{
"validAfter" : Timestamp(1577632842, 37),
"shard" : "shardRpSet2"
}
]
},{
"_id" : "database.sha_shard-uuid_\"5b935a89d91977490d04f740a86bccc2b3cc2bfb\"",
"lastmod" : Timestamp(3, 5),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "5b935a89d91977490d04f740a86bccc2b3cc2bfb"
},
"max" : {
"uuid" : "7a25fa7aa3a86ed259f646d7890db370e8b43ae7"
},
"shard" : "shardRpSet1",
"history" : [
{
"validAfter" : Timestamp(1577632856, 21509),
"shard" : "shardRpSet1"
}
]
},{
"_id" : "database.sha_shard-uuid_\"7a25fa7aa3a86ed259f646d7890db370e8b43ae7\"",
"lastmod" : Timestamp(3, 6),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "7a25fa7aa3a86ed259f646d7890db370e8b43ae7"
},
"max" : {
"uuid" : "810b573464d4894fc40b428ec82ec54d9a681bf6"
},
"shard" : "shardRpSet1",
"history" : [
{
"validAfter" : Timestamp(1577632856, 21509),
"shard" : "shardRpSet1"
}
]
},{
"_id" : "database.sha_shard-uuid_\"000043c071f23fc889275f77f950c649faac92e0\"",
"lastmod" : Timestamp(4, 0),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "000043c071f23fc889275f77f950c649faac92e0"
},
"max" : {
"uuid" : "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7"
},
"shard" : "shardRpSet2",
"history" : [
{
"validAfter" : Timestamp(1577635896, 15268),
"shard" : "shardRpSet2"
}
]
},{
"_id" : "database.sha_shard-uuid_\"1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7\"",
"lastmod" : Timestamp(5, 0),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7"
},
"max" : {
"uuid" : "3d165990d2969bbaf79b6b0d790080b46ca5f056"
},
"shard" : "shardRpSet",
"history" : [
{
"validAfter" : Timestamp(1577635906, 26457),
"shard" : "shardRpSet"
}
]
},{
"_id" : "database.sha_shard-uuid_\"3d165990d2969bbaf79b6b0d790080b46ca5f056\"",
"lastmod" : Timestamp(5, 1),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "3d165990d2969bbaf79b6b0d790080b46ca5f056"
},
"max" : {
"uuid" : "5b935a89d91977490d04f740a86bccc2b3cc2bfb"
},
"shard" : "shardRpSet1",
"history" : [
{
"validAfter" : Timestamp(1577632856, 21509),
"shard" : "shardRpSet1"
}
]
},{
"_id" : "database.sha_shard-uuid_\"c1788722a31a5a5a5caa00816ad85aeeda26e581\"",
"lastmod" : Timestamp(5, 2),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "c1788722a31a5a5a5caa00816ad85aeeda26e581"
},
"max" : {
"uuid" : "dcbd245e03d425aa14a85b51befde274856fc5f3"
},
"shard" : "shardRpSet",
"history" : [
{
"validAfter" : Timestamp(1577630416, 3),
"shard" : "shardRpSet"
}
]
},{
"_id" : "database.sha_shard-uuid_\"dcbd245e03d425aa14a85b51befde274856fc5f3\"",
"lastmod" : Timestamp(5, 3),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "dcbd245e03d425aa14a85b51befde274856fc5f3"
},
"max" : {
"uuid" : "fffff8c5e160711fb48f0d38ce01a98880e869e2"
},
"shard" : "shardRpSet",
"history" : [
{
"validAfter" : Timestamp(1577630416, 3),
"shard" : "shardRpSet"
}
]
},{
"_id" : "database.sha_shard-uuid_\"fffff8c5e160711fb48f0d38ce01a98880e869e2\"",
"lastmod" : Timestamp(6, 0),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "fffff8c5e160711fb48f0d38ce01a98880e869e2"
},
"max" : {
"uuid" : { "$maxKey" : 1 }
},
"shard" : "shardRpSet2",
"history" : [
{
"validAfter" : Timestamp(1577636268, 67),
"shard" : "shardRpSet2"
}
]
},{
"_id" : "database.sha_shard-uuid_\"810b573464d4894fc40b428ec82ec54d9a681bf6\"",
"lastmod" : Timestamp(6, 1),
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
"ns" : "database.sha_shard",
"min" : {
"uuid" : "810b573464d4894fc40b428ec82ec54d9a681bf6"
},
"max" : {
"uuid" : "c1788722a31a5a5a5caa00816ad85aeeda26e581"
},
"shard" : "shardRpSet",
"history" : [
{
"validAfter" : Timestamp(1577630416, 3),
"shard" : "shardRpSet"
}
]
}
Best Answer
Reference on how shard chunks work: https://docs.mongodb.com/v4.0/core/sharding-data-partitioning/
To understand which records go into a chunk, we need this statement from that page: "Each chunk has a inclusive lower and exclusive upper range based on the shard key". From now on, I will call this the chunk range.
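To make the inclusive lower / exclusive upper rule concrete, here is a minimal sketch in plain JavaScript. `inChunk` is a hypothetical helper written for this answer, not a MongoDB API, and it assumes the shard key values are lowercase ASCII hex strings, so that JavaScript's `<` and `>=` order them the same way as MongoDB's default binary string comparison. The boundaries are copied from two adjacent chunks in the question.

```javascript
// Sketch only: `inChunk` is a hypothetical helper, not part of MongoDB.
// A chunk owns every shard key value k with min <= k < max.
function inChunk(key, chunk) {
  return key >= chunk.min && key < chunk.max;
}

// Two adjacent chunks from the question's output.
const left = {
  min: "5b935a89d91977490d04f740a86bccc2b3cc2bfb",
  max: "7a25fa7aa3a86ed259f646d7890db370e8b43ae7",
};
const right = {
  min: "7a25fa7aa3a86ed259f646d7890db370e8b43ae7",
  max: "810b573464d4894fc40b428ec82ec54d9a681bf6",
};

// The shared boundary value belongs to exactly one of the two chunks.
const boundary = "7a25fa7aa3a86ed259f646d7890db370e8b43ae7";
console.log(inChunk(boundary, left));  // false: the upper bound is exclusive
console.log(inChunk(boundary, right)); // true: the lower bound is inclusive
```

Because one chunk's max is the next chunk's min, every possible key value falls into exactly one chunk.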
For example, in each chunk document from the question, the fields min and max are the chunk range.
This range defines what goes inside the chunk. You can understand how the range comparison works by reading the BSON comparison order reference: https://docs.mongodb.com/v4.0/reference/bson-type-comparison-order/
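As a rough illustration of that ordering, the sketch below sorts the ten chunks from the question by their min value and checks that each chunk starts where the previous one ends. `MIN_KEY`, `MAX_KEY` and `compareKeys` are hypothetical stand-ins written for this answer (they are not the driver's classes or a MongoDB function). They only model the part of the BSON order that matters here: MinKey sorts before every string and MaxKey after every string.

```javascript
// Hypothetical stand-ins for BSON MinKey / MaxKey (not the driver's classes).
const MIN_KEY = Symbol("MinKey");
const MAX_KEY = Symbol("MaxKey");

// Order two shard key values: MinKey < any string < MaxKey.
// Strings are lowercase ASCII hex here, so `<` matches binary ordering.
function compareKeys(a, b) {
  if (a === b) return 0;
  if (a === MIN_KEY || b === MAX_KEY) return -1;
  if (a === MAX_KEY || b === MIN_KEY) return 1;
  return a < b ? -1 : 1;
}

// [min, max] pairs in the order they appear in the question's output.
const chunks = [
  [MIN_KEY, "000043c071f23fc889275f77f950c649faac92e0"],
  ["5b935a89d91977490d04f740a86bccc2b3cc2bfb", "7a25fa7aa3a86ed259f646d7890db370e8b43ae7"],
  ["7a25fa7aa3a86ed259f646d7890db370e8b43ae7", "810b573464d4894fc40b428ec82ec54d9a681bf6"],
  ["000043c071f23fc889275f77f950c649faac92e0", "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7"],
  ["1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7", "3d165990d2969bbaf79b6b0d790080b46ca5f056"],
  ["3d165990d2969bbaf79b6b0d790080b46ca5f056", "5b935a89d91977490d04f740a86bccc2b3cc2bfb"],
  ["c1788722a31a5a5a5caa00816ad85aeeda26e581", "dcbd245e03d425aa14a85b51befde274856fc5f3"],
  ["dcbd245e03d425aa14a85b51befde274856fc5f3", "fffff8c5e160711fb48f0d38ce01a98880e869e2"],
  ["fffff8c5e160711fb48f0d38ce01a98880e869e2", MAX_KEY],
  ["810b573464d4894fc40b428ec82ec54d9a681bf6", "c1788722a31a5a5a5caa00816ad85aeeda26e581"],
];

// Sort by lower bound, then check each chunk begins at the previous chunk's max.
const sorted = [...chunks].sort((x, y) => compareKeys(x[0], y[0]));
const contiguous = sorted.every((c, i) => i === 0 || sorted[i - 1][1] === c[0]);
console.log(contiguous); // true
```

Once sorted, the chunks tile the whole key space from MinKey to MaxKey with no gaps or overlaps. The order in which the chunk documents are listed has nothing to do with the order of their ranges.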
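In your case, if the uuid field only contains strings, a record is inside a chunk range when its uuid is greater than or equal to min and less than max, with the strings compared as text using MongoDB's default binary string comparison, not as hexadecimal numbers.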
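Here is a small sketch of that evaluation with the comparison written out by hand, so you can see that it works character by character on the text. `compareHexStrings` and `isInRange` are hypothetical names, not MongoDB functions, and the uuid values passed in are made up for illustration. The sketch assumes lowercase ASCII hex, where comparing character codes gives the same result as comparing bytes.

```javascript
// Character-by-character comparison, spelled out to show the mechanism.
// Returns a negative number, 0, or a positive number.
function compareHexStrings(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const ca = a.charCodeAt(i);
    const cb = b.charCodeAt(i);
    if (ca !== cb) return ca < cb ? -1 : 1; // first differing character decides
  }
  // All shared characters are equal: the shorter string sorts first.
  return Math.sign(a.length - b.length);
}

// Inclusive lower bound, exclusive upper bound.
function isInRange(uuid, min, max) {
  return compareHexStrings(uuid, min) >= 0 && compareHexStrings(uuid, max) < 0;
}

// One chunk range from the question.
const min = "5b935a89d91977490d04f740a86bccc2b3cc2bfb";
const max = "7a25fa7aa3a86ed259f646d7890db370e8b43ae7";

// Made-up uuid values, for illustration only.
console.log(isInRange("5c00000000000000000000000000000000000000", min, max)); // true: "5c" > "5b" and "5" < "7"
console.log(isInRange("5b00000000000000000000000000000000000000", min, max)); // false: "5b0" < "5b9"
console.log(isInRange("6", min, max)); // true: compared as text, "6" lies between "5..." and "7..."
```

In ASCII the digits 0-9 sort before the letters a-f, so for lowercase hex strings of equal length the text order matches the numeric order. That would no longer hold if the values mixed uppercase and lowercase letters or had different lengths.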