MongoDB: moving a value from NumberInt to NumberLong

mongodb

We are trying to preallocate documents in a collection. NumberInt is usually sufficient for our values. Once a document receives about 1 million writes (it is one document per day in a time series, so 1 million is possible), the value moves from NumberInt to NumberLong.

We are planning to disable padding, so growth of even a single byte will move the document.

The question is whether MongoDB will move the document if the NumberInt changes to NumberLong.

Best Answer

If you are using MongoDB 3.0 + MMAPv1 with noPadding allocation (aka "exact fit"), then the document will have to move if you change from 32-bit ints (4 bytes) to 64-bit longs (8 bytes). With noPadding there is no extra space available to store more bytes, so you should start with NumberLongs to avoid expected document growth.
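To make the 4-byte growth concrete, here is a minimal sketch that computes the encoded size of a small flat document by following the BSON layout (length prefix, typed elements, terminator). The helper `bsonSize` and its field descriptions are hypothetical and written for illustration only; they are not part of any MongoDB driver, and the sketch covers just strings, 32-bit ints, and 64-bit ints.

```javascript
// Hypothetical helper: BSON size of a flat document, following the BSON layout.
// Supported field types here: 'string', 'int32', 'int64'.
function bsonSize(fields) {
  const utf8Length = (s) => new TextEncoder().encode(s).length;
  let size = 4 + 1; // int32 length prefix + trailing 0x00 terminator
  for (const { name, type, value } of fields) {
    size += 1 + utf8Length(name) + 1; // type byte + field name as a cstring
    if (type === 'string') {
      size += 4 + utf8Length(value) + 1; // int32 length + bytes + 0x00
    } else if (type === 'int32') {
      size += 4;
    } else if (type === 'int64') {
      size += 8;
    } else {
      throw new Error('unsupported type: ' + type);
    }
  }
  return size;
}

const idField = { name: '_id', type: 'string', value: 'nopadding' };
const sizeWithInt = bsonSize([idField, { name: 'value', type: 'int32' }]);
const sizeWithLong = bsonSize([idField, { name: 'value', type: 'int64' }]);

console.log(sizeWithInt);                // 35
console.log(sizeWithLong);               // 39
console.log(sizeWithLong - sizeWithInt); // 4
```

With an exact-fit record there is no room for those 4 extra bytes, which is why the update has to relocate the document.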


You can confirm the outcome via the database profiler. Here's a quick run in the 3.0.7 mongo shell:

// Drop test collection
db.mycollection.drop();

// Create new test collection with noPadding (exact-fit allocation)
db.createCollection("mycollection", { noPadding: true });

// Enable profiling of all operations
db.setProfilingLevel(2);

// Insert a document with a 32-bit int
db.mycollection.insert({ _id: 'nopadding', value: NumberInt(42) });

// Update document to have a 64-bit int
db.mycollection.update({ _id: 'nopadding' }, { $set: { value: NumberLong(10000000) }});

// Disable profiling
db.setProfilingLevel(0);

// Check if our update operation resulted in a move
db.system.profile.find({ op: 'update', 'query._id': 'nopadding', moved: true }).count();
// Output: 1

Related Solutions

Mongodb – Structuring data in MongoDB for a to-do list type application

Creating a sub-collection should offer better read performance; see http://docs.mongodb.org/manual/core/data-modeling/

I think it would be a good idea to add an id to each subdocument so that it is easier to access (your schema allows identical items, so the text alone cannot tell them apart).

For example:

{
    username: "joe",
    password: "...",
    items: [
        {
            _id: subid,    // unique id for this subdocument
            text: "This is a task",
            completed: 1
        },
        ...
    ]
},
...

You can use subid = ObjectId() to autogenerate ids, or create them in your application.

I don't think there is a problem with having thousands of items in a document; just remember that the maximum document size is 16 megabytes.
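As a rough illustration of why a per-item id helps, the sketch below uses plain JavaScript objects rather than a live database. It assumes each subdocument carries its own id in a field called `_id`; the `newSubId` generator and the `completeItem` helper are hypothetical stand-ins (in practice `ObjectId()` or an application-generated id would fill that role).

```javascript
// Hypothetical id generator standing in for ObjectId(); any unique value works.
let nextId = 1;
const newSubId = () => `item-${nextId++}`;

const user = {
  username: "joe",
  items: [
    { _id: newSubId(), text: "This is a task", completed: 0 },
    { _id: newSubId(), text: "This is a task", completed: 0 }, // identical text
  ],
};

// Mark exactly one item as completed by its id, leaving its twin untouched.
function completeItem(doc, subId) {
  const item = doc.items.find((i) => i._id === subId);
  if (item) item.completed = 1;
  return Boolean(item);
}

const secondId = user.items[1]._id;
const found = completeItem(user, secondId);
console.log(found);                               // true
console.log(user.items.map((i) => i.completed));  // [ 0, 1 ]
```

Without the id, the two items would be indistinguishable, and an update matching on text would be ambiguous.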

MongoDB scaling when working with files

My answer is: it depends. If you access files by the _id field, which is already indexed, then you won't need to add more memory any time soon.

The _id field, of type ObjectId, is 12 bytes in size, so in principle it can distinguish up to 2^(12*8) = 2^96 files. Three of those bytes hold the machine ID, a hash value that is fixed on a given machine; subtracting them leaves approximately 2^72 files. For reference, 2^20 is 1,048,576.

In terms of memory, the keys of the index on the _id field need 10,000,000 x 12 bytes = about 114 MiB. To be honest, I don't know how much overhead there is for an index holding 10 million values, but I don't think it will need more than 1 gigabyte.
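The arithmetic above can be checked with a short sketch. The constant names are chosen here for illustration, and the 10,000,000 file count is the figure used in the estimate; the result covers only the raw key bytes, not any index overhead.

```javascript
const OBJECT_ID_BYTES = 12;
const MACHINE_ID_BYTES = 3;

// Bits left once the fixed machine-id bytes are set aside: (12 - 3) * 8 = 72.
const remainingBits = (OBJECT_ID_BYTES - MACHINE_ID_BYTES) * 8;
const distinctIds = 2n ** BigInt(remainingBits); // BigInt avoids precision loss

// Raw key bytes for 10 million ObjectId values, converted to MiB.
const fileCount = 10000000;
const keyBytes = fileCount * OBJECT_ID_BYTES;
const keyMiB = keyBytes / (1024 * 1024);

console.log(remainingBits);          // 72
console.log(distinctIds.toString()); // 4722366482869645213696
console.log(keyBytes);               // 120000000
console.log(keyMiB.toFixed(1));      // 114.4
```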

Now, if your _id field is not of type ObjectId, then do the math for your own key size.

In GridFS, the filename field of the files collection is also indexed. If you are not accessing files by filename, you may leave it blank and drop the index on filename.

On the other hand, if you add metadata to your files and want to query the files by that metadata, then you should have indexes on those fields and do the math again.

I have a production environment with over 3,000,000 PDF files (taking 180 GB of disk space). My server is a virtual server with 4 vCPUs and 4 GB of RAM, and there is still no problem. The specs you provided are far higher than you need; you could store billions of files on those servers, especially if you have SSDs. Even if your indexes do not fit into memory, swapping will be very fast and you won't even notice a slowdown.
