Creating a subcollection should offer better read performance.
See http://docs.mongodb.org/manual/core/data-modeling/
I think it would be a good idea to add an id to each subdocument so that it is easier to access (your schema allows identical items, so nothing else distinguishes them),
like:
{
username: "joe",
password: "...",
items: [
{
_id: subid,
text: "This is a task",
completed: 1
},
...
]
},
...
You can use subid = ObjectId()
to autogenerate the ids, or create them in your application.
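As a rough illustration of why per-item ids help, here is a minimal sketch in plain JavaScript. The helper newSubId() is hypothetical and only stands in for ObjectId(), and the field name _id and the function findItem() are assumptions made for this example, not part of your schema.

```javascript
// Hypothetical stand-in for ObjectId(): 24 random hex characters.
// In the mongo shell or a driver you would call ObjectId() instead.
function newSubId() {
  let id = "";
  for (let i = 0; i < 24; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

// Two items with identical content remain distinguishable by their ids.
const user = {
  username: "joe",
  items: [
    { _id: newSubId(), text: "This is a task", completed: 1 },
    { _id: newSubId(), text: "This is a task", completed: 1 },
  ],
};

// Look up one specific item by id rather than by its (possibly duplicated) text.
function findItem(doc, id) {
  return doc.items.find((item) => item._id === id);
}

const second = findItem(user, user.items[1]._id);
console.log(second === user.items[1]); // true
```

Searching by text alone would always return the first of the two identical tasks, which is the ambiguity the id removes.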
I don't think there is a problem with having thousands of items in a document; just remember that the maximum document size is 16 megabytes.
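To get a feel for how far thousands of items are from that limit, here is a rough sketch in plain JavaScript. It uses the UTF-8 length of the JSON form as a proxy for the stored size, which is an assumption (the BSON encoding differs somewhat), and the 5,000 sample items are made up; in the mongo shell, Object.bsonsize(doc) should give the real figure.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // 16 MB document limit

// Rough size proxy: UTF-8 length of the JSON form (an assumption; BSON differs somewhat).
function approxBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// Build a sample user document with 5,000 small items (made-up data).
const items = [];
for (let i = 0; i < 5000; i++) {
  items.push({ text: "This is a task", completed: 1 });
}
const doc = { username: "joe", password: "...", items };

const size = approxBytes(doc);
console.log(size, "bytes,", ((size / MAX_DOC_BYTES) * 100).toFixed(2) + "% of the limit");
```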
My answer is: it depends.
If you are accessing files by the _id field, which is already indexed, then you won't need to add more memory any time soon.
The _id field, which is of type ObjectID, is 12 bytes in size. That means it can address up to 2^(12*8) = 2^96 files. Three of those bytes hold the machine ID, a hash value that is fixed on a given machine, so they can be subtracted, which leaves you approximately 2^72 files. For reference, 2^20 is 1,048,576.
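The arithmetic above can be checked with a few lines of plain JavaScript using BigInt. This is only a sketch of the calculation as described here (12 bytes in total, 3 of them fixed per machine); the constant names are made up for the example.

```javascript
const OBJECT_ID_BYTES = 12n;
const MACHINE_ID_BYTES = 3n;

const totalBits = OBJECT_ID_BYTES * 8n;                        // 96
const usableBits = (OBJECT_ID_BYTES - MACHINE_ID_BYTES) * 8n;  // 72

console.log(2n ** totalBits);  // 79228162514264337593543950336n
console.log(2n ** usableBits); // 4722366482869645213696n
console.log(2n ** 20n);        // 1048576n
```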
In terms of memory, the index on the _id field needs 10,000,000 x 12 bytes = 114 MiB for the keys alone. To be honest, I don't know how much overhead there will be for an index that holds 10 million values, but I don't think it will need more than 1 gigabyte.
Now, if your _id field is not of type ObjectID, then do the math for your own key size.
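Doing the math is a one-liner. Here is a small sketch in plain JavaScript; rawIndexMiB() is a made-up helper, and it counts only the raw key bytes, not the index overhead I mentioned.

```javascript
// Raw key bytes for an index, in MiB. Ignores per-entry and B-tree overhead.
function rawIndexMiB(fileCount, keyBytes) {
  return (fileCount * keyBytes) / (1024 * 1024);
}

const FILE_COUNT = 10_000_000;
const KEY_BYTES = 12; // ObjectID; change this if your _id is another type

console.log(rawIndexMiB(FILE_COUNT, KEY_BYTES).toFixed(1) + " MiB"); // 114.4 MiB
```

For a different _id type, pass its size in bytes as the second argument.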
In GridFS, the filename field of the files collection is also indexed. If you are not accessing files by filename, you can leave it blank and drop the filename index.
On the other hand, if you add metadata to your files and want to query the files by that metadata, then you should have indexes on those metadata fields, and you need to do the math again.
I have a production environment with over 3,000,000 PDF files (taking 180 GB of disk space). My server is a virtual server with 4 vCPUs and 4 GB of RAM, and there is still no problem. The specs you provided are far higher than you need. You can store billions of files on those servers, especially if you have SSDs: even if your indexes do not fit into memory, swapping will be very fast and you won't even notice a slowdown.
Best Answer
If you are using MongoDB 3.0 + MMAPv1 with noPadding allocation (aka "exact fit"), then the document will have to move if you change from 32-bit ints (4 bytes) to 64-bit longs (8 bytes). With noPadding there is no extra space available to store more bytes, so you should start with NumberLongs to avoid expected document growth.
You can confirm the outcome via the database profiler. Here's a quick run in the 3.0.7 mongo shell: