We have a database with around 50 collections and 70 million records. All records belong to specific customers and have a `customerId` property. Currently all clients are in the USA. However, we're in the process of adding EU clients and would like to host their data in an EU datacenter.
Since all records have a `customerId` (which is a string) and virtually all of our queries also specify this `customerId`, this makes it a good choice for a shard key for us. We're only going to have one shard per zone, and we'd like all of a customer's data to be in that one shard.
My question is, given that the `customerId`s are strings, how do we specify a `minimum` and `maximum` for the `sh.updateZoneKeyRange()` function? Obviously the `minimum` will be the `customerId` (e.g., "customer-somename"), but how do we specify a `maximum` of "customer-somename" + 1? The issue here is that `maximum` is exclusive, so it can't be the same as `minimum`.
Best Answer
Assuming your `customerId` does not have any obvious mapping to location, it would be best to add a `location` field and create a compound shard key. The `location` field should be a prefix of the shard key, otherwise you will have an administrative headache trying to maintain zones for small ranges of `customerId` values (which could end up as granular as a single customer).
With a compound shard key of `{location: 1, customerId: 1}`, your zone ranges would be straightforward to define: each zone needs a single range that covers every `customerId` for its `location` value.
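As a rough sketch of what those ranges could look like: the range for a location runs from `{location, customerId: MinKey}` (inclusive) to `{location, customerId: MaxKey}` (exclusive), so no "customerId + 1" value is needed. The namespace `mydb.records`, the shard names `shardUS` and `shardEU`, and the helpers `zoneBounds` and `applyZones` are assumptions made up for illustration; only `sh.addShardToZone()` and `sh.updateZoneKeyRange()` are actual shell helpers.

```javascript
// Assumed layout: one shard per zone, one location value per zone.
// Shard names and zone names here are placeholders.
const zones = [
  { zone: "US", shard: "shardUS", location: "US" },
  { zone: "EU", shard: "shardEU", location: "EU" },
];

// Builds the [min, max) bounds that cover every customerId for one location,
// assuming the shard key { location: 1, customerId: 1 }.
function zoneBounds(location, minKey, maxKey) {
  return {
    min: { location: location, customerId: minKey }, // inclusive lower bound
    max: { location: location, customerId: maxKey }, // exclusive upper bound
  };
}

// Applies the layout using the shell's `sh` helper object.
function applyZones(sh, ns, zoneList, minKey, maxKey) {
  for (const z of zoneList) {
    sh.addShardToZone(z.shard, z.zone);
    const bounds = zoneBounds(z.location, minKey, maxKey);
    sh.updateZoneKeyRange(ns, bounds.min, bounds.max, z.zone);
  }
}

// In mongosh, connected to a mongos:
//   applyZones(sh, "mydb.records", zones, MinKey(), MaxKey());
```

Passing `sh`, `MinKey()` and `MaxKey()` in as arguments is only a convenience so the helper can be tried outside the shell; calling `sh.updateZoneKeyRange()` directly with the same bounds is equivalent.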
The MongoDB manual includes a similar example in Segmenting Data by Location, and this is the same approach used for MongoDB Atlas' Global Clusters feature. Both of these examples suggest two-character country codes (ISO 3166-1 alpha-2) for the `location` field, but you could choose any values which provide suitable granularity for your future use cases. For example, ISO 3166-2 would provide for countries and subdivisions (administrative regions like state or province).
Starting in MongoDB 4.2, you can also change a document's shard key value if you need to move existing user data between zones. You presumably would never want to change your `customerId` values, but could update a `location` field.
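A minimal sketch of such a move, assuming the compound shard key `{location: 1, customerId: 1}` and MongoDB 4.2 or later. The helper name `moveCustomer` and the collection handle `coll` are hypothetical. As far as I know, changing a shard key value has to go through a `mongos`, run as a retryable write or inside a transaction, target a single document, and include an equality match on the full shard key in the filter, which is why the sketch updates documents one at a time.

```javascript
// Moves one customer's documents to another zone by rewriting `location`.
// `coll` is assumed to be a collection handle obtained through a mongos,
// e.g. db.getCollection("records") in mongosh.
function moveCustomer(coll, customerId, fromLocation, toLocation) {
  // Collect the _ids up front; a multi-document update cannot change
  // a shard key value, so each document is updated individually.
  const ids = coll
    .find({ location: fromLocation, customerId: customerId }, { _id: 1 })
    .toArray()
    .map((doc) => doc._id);

  let moved = 0;
  for (const id of ids) {
    // The filter carries an equality match on the full shard key.
    const result = coll.updateOne(
      { _id: id, location: fromLocation, customerId: customerId },
      { $set: { location: toLocation } }
    );
    moved += result.modifiedCount;
  }
  return moved; // number of documents whose location changed
}

// In mongosh, connected to a mongos:
//   moveCustomer(db.getCollection("records"), "customer-somename", "US", "EU");
```

For a customer with many documents, loading every `_id` into memory at once may not be appropriate; batching the work would be a reasonable refinement.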