MongoDB – Is sharding effective for small collections?

mongodb

It looks like database sharding is great if I have huge collections. What if I have lots of moderately sized collections? Let's say that sharding is effective for 1 collection of 100 000 000 documents (fairly small documents, such as comments). Is it also effective for 10 000 collections with 10 000 documents each?

(I think this question is still valid for table-oriented databases if you replace collections with tables and documents with rows. If possible, I'd like to know the theoretical answer as well as the answer for MongoDB specifically, if the two differ.)

Best Answer

Is it also effective for 10 000 collections with 10 000 documents each?

Most people have the "single large collection" problem, so sharding is clearly useful for reducing the headache of balancing that data.

However, when you have 10 000 small collections, your headache is probably not "balancing the data". With this many small collections, your problem is more likely keeping track of them. Depending on your document size, a collection may never grow past a single chunk, in which case it is never split and sharding it has no practical effect.
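To make the size argument concrete, here is a rough back-of-the-envelope sketch. The 64 MiB chunk size and the 1 KiB average document size are assumptions for illustration, not figures from the question; check the chunk size configured on your own cluster.

```python
# Rough estimate of whether a collection is large enough to be split
# into more than one chunk. All numbers are illustrative assumptions.
CHUNK_SIZE_BYTES = 64 * 1024 * 1024  # assumed chunk size (64 MiB)
AVG_DOC_BYTES = 1024                 # assumed average document size (1 KiB)


def collection_size_bytes(doc_count: int, avg_doc_bytes: int) -> int:
    """Approximate data size, ignoring indexes and storage overhead."""
    return doc_count * avg_doc_bytes


def exceeds_one_chunk(doc_count: int,
                      avg_doc_bytes: int = AVG_DOC_BYTES,
                      chunk_size_bytes: int = CHUNK_SIZE_BYTES) -> bool:
    """True if the collection is bigger than a single chunk."""
    return collection_size_bytes(doc_count, avg_doc_bytes) > chunk_size_bytes


small = exceeds_one_chunk(10_000)        # one of the 10 000 small collections
large = exceeds_one_chunk(100_000_000)   # the single large collection
```

Under these assumptions a 10 000-document collection is about 10 MB, well under one chunk, while the 100 000 000-document collection spans many chunks.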

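For the really small collections, you can use the little-known movePrimary command to manage the location of your data. It changes the primary shard of a database, which is where that database's unsharded collections live, so it relocates data per database rather than per collection.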
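As a minimal sketch, the movePrimary command is a small document sent to the admin database through a mongos. The database name and shard name below are hypothetical placeholders, and the helper function is only there for illustration.

```python
def move_primary_command(database: str, to_shard: str) -> dict:
    """Build the movePrimary command document.

    The command name must be the first key; Python dicts preserve
    insertion order, so this layout is safe to send as-is.
    """
    return {"movePrimary": database, "to": to_shard}


# Hypothetical names: replace with a real database and shard.
cmd = move_primary_command("small_collections_db", "shard0001")

# With PyMongo and a client connected to a mongos (assumed setup),
# the command would be issued against the admin database:
#   client.admin.command(cmd)
```

Because the command works at the database level, grouping related small collections into the same database is what lets you place them together on one shard.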

Of course, the other way to look at this is to ask why you have 10k collections in the first place. A collection doesn't need to hold homogeneous objects, and with 10k collections most of them must be generated programmatically. It's quite possible to store different "types" of data in the same collection, which reduces the number of collections, and then include the type as part of the shard key.
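One way to picture the merged layout, as a sketch only: each document gains a field naming the collection it used to live in, and that field leads a compound shard key. The field name "type", the namespace "app.items", and the choice of "_id" as the second key field are all assumptions made for this example.

```python
def tag_document(source_collection: str, doc: dict) -> dict:
    """Copy a document and record which former collection it came from."""
    merged = dict(doc)
    merged["type"] = source_collection  # assumed field name
    return merged


def shard_collection_command(namespace: str) -> dict:
    """Build a shardCollection command with the type leading the shard key."""
    return {"shardCollection": namespace, "key": {"type": 1, "_id": 1}}


# Hypothetical document from a former per-topic collection.
doc = tag_document("comments_topic_42", {"_id": 7, "text": "hello"})
cmd = shard_collection_command("app.items")

# With PyMongo connected to a mongos (assumed setup):
#   client.admin.command(cmd)
```

Queries that filter on the type field can then be routed to the shards holding that type, rather than being broadcast to every shard.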