Reproducing the situation:
- mc1 collection already created on shard0000 before sharding
- mc2 collection already created on shard0001 before sharding
Create a test sharded cluster:
$ mongo --nodb
mongo> config={d0:{smallfiles:"",noprealloc:"",nopreallocj : ""}, d1 : {smallfiles : "", noprealloc : "", nopreallocj : "" }};
mongo> cluster = new ShardingTest({shards:config});
In another terminal (shard0000 on port 30000, shard0001 on port 30001, mongos on port 30999):
$ ps aux | grep mongo
root 9668 0.0 0.7 655804 23484 pts/1 Sl+ 15:57 0:00 mongo --nodb
root 9693 0.4 2.4 344400 74108 pts/1 Sl+ 15:59 0:00 mongod --port 30000 --dbpath /data/db/test0 --smallfiles --noprealloc --nopreallocj --setParameter enableTestCommands=1
root 9712 0.3 2.1 271672 62828 pts/1 Sl+ 15:59 0:00 mongod --port 30001 --dbpath /data/db/test1 --smallfiles --noprealloc --nopreallocj --setParameter enableTestCommands=1
root 9732 0.2 0.5 117672 16200 pts/1 Sl+ 15:59 0:01 mongos --port 30999 --configdb localhost:30000 --chunkSize 50 --setParameter enableTestCommands=1
Create mc1 on shard0000 and mc2 on shard0001 with some data:
$ mongo --port 30000
> use cluster
> for(var i=0; i<=1000; i++) {db.mc1.insert({'a':i})}
> db.mc1.createIndex({a:1})
$ mongo --port 30001
> use cluster
> for(var i=0; i<=1000; i++) {db.mc2.insert({'a':i})}
> db.mc2.createIndex({a:1})
Connect to mongos, enable sharding on the cluster database, and shard the mc1 and mc2 collections:
$ mongo --port 30999
mongos> show dbs
admin (empty)
cluster 0.063GB
config 0.016GB
mongos> use cluster
switched to db cluster
mongos> show collections
mc2 # only mc2 is shown because the primary shard is shard0001 (see sh.status() below)
system.indexes
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("55dacecf2c2855020c52454e")
}
shards:
{ "_id" : "shard0000", "host" : "localhost:30000" }
{ "_id" : "shard0001", "host" : "localhost:30001" }
balancer:
Currently enabled: no
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "cluster", "partitioned" : false, "primary" : "shard0001" }
mongos> sh.enableSharding("cluster")
{ "ok" : 1 }
mongos> sh.shardCollection("cluster.mc1", {'a':1})
{ "collectionsharded" : "cluster.mc1", "ok" : 1 }
mongos> sh.shardCollection("cluster.mc2", {'a':1})
{ "collectionsharded" : "cluster.mc2", "ok" : 1 }
Check the number of documents in the shard distribution:
mongos> db.mc1.getShardDistribution()
Shard shard0001 at localhost:30001
data : 0B docs : 0 chunks : 1
estimated data per chunk : 0B
estimated docs per chunk : 0
Totals
data : 0B docs : 0 chunks : 1
Shard shard0001 contains NaN% data, NaN% docs in cluster, avg obj size on shard : NaNGiB
mongos> db.mc2.getShardDistribution()
Shard shard0001 at localhost:30001
data : 46KiB docs : 1001 chunks : 1
estimated data per chunk : 46KiB
estimated docs per chunk : 1001
Totals
data : 46KiB docs : 1001 chunks : 1
Shard shard0001 contains 100% data, 100% docs in cluster, avg obj size on shard : 48B
Problem: mc1 contains no documents. As sh.status() shows, the primary shard for the cluster database is shard0001, so only the collections of the cluster database that live on shard0001 got sharded.
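To spot this before sharding, you can look up the primary shard of a database yourself. Below is a minimal sketch for the mongo shell, assuming you are connected to mongos. The helper name primaryShardOf is my own (hypothetical); it reads the same config.databases entries that sh.status() prints under "databases:".

```javascript
// Hypothetical helper: returns the primary shard recorded for a database,
// or null when the database is not listed in config.databases.
// `conn` is any object with getSiblingDB(), e.g. the shell's global `db`.
function primaryShardOf(conn, dbName) {
  var entry = conn.getSiblingDB("config").databases.findOne({ _id: dbName });
  return entry ? entry.primary : null;
}

// Only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  print("primary shard for cluster: " + primaryShardOf(db, "cluster"));
}
```

If the shard it reports is not the one that actually holds your collection, you are in the situation described here.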
Solution 1:
- Dump mc1 collection from shard0000
- Drop mc1 collection from shard0000
- Restore mc1 collection into mongos
(Important: make sure nobody can add more records while you do this!)
$ mkdir mc1
$ mongodump --port 30000 --db=cluster --collection=mc1 --out mc1/
$ mongo --port 30000 --eval "db.mc1.drop()" cluster
$ mongorestore --port 30999 --db=cluster --collection=mc1 mc1/cluster/mc1.bson
$ mongo --port 30999
MongoDB shell version: 3.0.5
connecting to: 127.0.0.1:30999/test
Server has startup warnings:
2015-08-24T15:59:11.178+0800 I CONTROL ** WARNING: You are running this process as the root user, which is not recommended.
2015-08-24T15:59:11.178+0800 I CONTROL
mongos> use cluster
switched to db cluster
mongos> show collections
mc1
mc2
system.indexes
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("55dacecf2c2855020c52454e")
}
shards:
{ "_id" : "shard0000", "host" : "localhost:30000" }
{ "_id" : "shard0001", "host" : "localhost:30001" }
balancer:
Currently enabled: no
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "cluster", "partitioned" : true, "primary" : "shard0001" }
cluster.mc1
shard key: { "a" : 1 }
chunks:
shard0001 4
{ "a" : { "$minKey" : 1 } } -->> { "a" : 1 } on : shard0001 Timestamp(1, 1)
{ "a" : 1 } -->> { "a" : 651 } on : shard0001 Timestamp(1, 2)
{ "a" : 651 } -->> { "a" : 977 } on : shard0001 Timestamp(1, 3)
{ "a" : 977 } -->> { "a" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 4)
cluster.mc2
shard key: { "a" : 1 }
chunks:
shard0001 1
{ "a" : { "$minKey" : 1 } } -->> { "a" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 0)
Solution 2 (without having to dump):
(Dropping a collection or a database will remove it from the cluster configuration automatically)
Drop the 'empty' mc1 collection through mongos (first check that it really is empty):
mongos> use cluster
switched to db cluster
mongos> db.mc1.find().count()
0
mongos> db.mc1.drop();
true
Move primary to shard0000:
mongos> use admin
switched to db admin
mongos> db.runCommand({movePrimary:"cluster", to:"shard0000"});
{ "primary " : "shard0000:localhost:30000", "ok" : 1 }
Checking the status:
mongos> use cluster
switched to db cluster
mongos> sh.status();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("55dacecf2c2855020c52454e")
}
shards:
{ "_id" : "shard0000", "host" : "localhost:30000" }
{ "_id" : "shard0001", "host" : "localhost:30001" }
balancer:
Currently enabled: no
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "cluster", "partitioned" : true, "primary" : "shard0000" }
cluster.mc2
shard key: { "a" : 1 }
chunks:
shard0001 1
{ "a" : { "$minKey" : 1 } } -->> { "a" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 0)
mongos> show collections
mc1
system.indexes
mongos> db.mc1.find().count()
1001
mongos> db.mc2.find().count()
1001
Now shard the mc1 collection:
mongos> sh.shardCollection("cluster.mc1", {'a':1});
{ "collectionsharded" : "cluster.mc1", "ok" : 1 }
mongos> db.mc1.getShardDistribution()
Shard shard0000 at localhost:30000
data : 46KiB docs : 1001 chunks : 1
estimated data per chunk : 46KiB
estimated docs per chunk : 1001
Totals
data : 46KiB docs : 1001 chunks : 1
Shard shard0000 contains 100% data, 100% docs in cluster, avg obj size on shard : 48B
mongos> db.mc2.getShardDistribution()
Shard shard0001 at localhost:30001
data : 46KiB docs : 1001 chunks : 1
estimated data per chunk : 46KiB
estimated docs per chunk : 1001
Totals
data : 46KiB docs : 1001 chunks : 1
Shard shard0001 contains 100% data, 100% docs in cluster, avg obj size on shard : 48B
Joy joy joy, that added it!!!
So, in your case (connect to mongos):
> use cluster
> db.fs.chunks.find().count() // check that this is 0
> db.fs.chunks.drop()
> use admin
> db.runCommand({movePrimary:"cluster", to:"shard0001"});
> use cluster
> sh.shardCollection("cluster.fs.chunks", {'files_id':1,'n':1});
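Since the drop is the one destructive step here, you may want to wrap the "check that it is empty, then drop" part in a small guard. This is only a sketch under my own assumptions: dropIfEmpty is a hypothetical helper, not something the shell provides, and it uses the same find().count() and drop() calls as above.

```javascript
// Hypothetical helper: drops a collection only when it holds no documents.
// Returns true if the collection was dropped, false if it was left alone.
// `database` is any object with getCollection(), e.g. the shell's global `db`.
function dropIfEmpty(database, collName) {
  var coll = database.getCollection(collName);
  if (coll.find().count() !== 0) {
    return false; // not empty: refuse to drop
  }
  return coll.drop();
}

// Only runs inside the mongo shell (connected to mongos).
if (typeof db !== "undefined") {
  print("dropped: " + dropIfEmpty(db.getSiblingDB("cluster"), "fs.chunks"));
}
```

The count and the drop are two separate operations, so this does not protect against writes arriving in between; you still need to make sure nobody is inserting while you do this.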
Schema Validation
You can specify Schema Validation on the documents within a collection to validate the data types you can insert or update for each field. In addition, you can also set other constraints, like a required field. This feature allows you to maintain a specific data type for a field. Note that Schema Validation is optional.
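As a rough illustration, a validator that requires a price field of type double could look like the sketch below. The collection name items and the price field are only examples, and $jsonSchema assumes MongoDB 3.6 or later.

```javascript
// Collection options carrying a $jsonSchema validator.
// The collection name `items` and the field `price` are illustrative only.
var itemsOptions = {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["price"],
      properties: {
        price: {
          bsonType: "double",
          description: "must be a double and is required"
        }
      }
    }
  }
};

// Only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  db.createCollection("items", itemsOptions);
}
```

With this in place, an insert that has no price, or has a price stored as a string, should be rejected under the default validation settings.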
Field Types
By default, numbers entered in the mongo shell are stored as type double. There are other number types such as int, long and decimal (NumberDecimal). You can convert between types within your queries as required. There are various operators for converting from one type to another (including numbers stored as strings and vice versa); $convert is one such operator. There are also operators to check whether a value is of a specific type, before or during processing.
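For instance, a conversion of a mixed-type price field to double might be sketched like this. The items collection and the priceAsDouble output field are assumptions of mine, and $convert needs MongoDB 4.0 or later.

```javascript
// Aggregation pipeline that adds a `priceAsDouble` field.
// Values that cannot be converted, or that are missing, become null
// instead of failing the whole aggregation.
var toDoublePipeline = [
  {
    $addFields: {
      priceAsDouble: {
        $convert: { input: "$price", to: "double", onError: null, onNull: null }
      }
    }
  }
];

// Only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  printjson(db.items.aggregate(toDoublePipeline).toArray());
}
```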
The query operator $type can find documents where the price field is of type string. The aggregation operator $type can return all documents with a new field telling the data type of a specific field.
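Both uses of $type could look roughly like this in the shell. The items collection and the priceType field name are placeholders of mine; the aggregation form assumes MongoDB 3.4 or later.

```javascript
// Query operator $type: matches documents whose `price` is stored as a string.
var stringPriceFilter = { price: { $type: "string" } };

// Aggregation operator $type: adds a `priceType` field naming the
// data type of `price` for every document.
var priceTypePipeline = [
  { $addFields: { priceType: { $type: "$price" } } }
];

// Only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  printjson(db.items.find(stringPriceFilter).toArray());
  printjson(db.items.aggregate(priceTypePipeline).toArray());
}
```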
Notes:
Mixed data types are something one comes across when data is fed in from various sources or systems. I suspect you need to handle these as needed, identifying and handling them at some stage of the application. Regarding indexing on mixed types, I think there are indexing features you can take advantage of, depending on your situation: Collation and Partial Indexes.
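To make those two index features a bit more concrete, here is a sketch of what each could look like for a mixed-type price field. The collection name, index names and the choice of locale are my own assumptions, so treat them as starting points rather than a recommendation.

```javascript
// Partial index: only documents whose `price` is numeric are indexed.
var numericPriceIndex = {
  keys: { price: 1 },
  options: {
    name: "price_numeric_only",
    partialFilterExpression: { price: { $type: "number" } }
  }
};

// Collation index: numericOrdering makes numeric strings compare as
// numbers ("10" sorts after "2") for queries using the same collation.
var numericStringIndex = {
  keys: { price: 1 },
  options: {
    name: "price_numeric_string_order",
    collation: { locale: "en", numericOrdering: true }
  }
};

// Only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  [numericPriceIndex, numericStringIndex].forEach(function (ix) {
    printjson(db.items.createIndex(ix.keys, ix.options));
  });
}
```

Note that a query has to specify the same collation to use the second index, and has to restrict price to numeric values to use the first.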