MongoDB – Database vs Collection memory-wise

mongodb

Let's say, for example, I need to have 5 collections, each about 10 GB.
What is the difference in performance, with emphasis on memory usage, between assigning each collection to its own database and keeping all of the collections in the same database?

Also, in this scenario, what's the difference between the MMAPv1 storage engine and WiredTiger?

EDIT: I have spoken to the MongoDB team, and they assured me there should be no difference between having multiple collections in a single database and having one collection in each of several databases.

Best Answer

As you said you have MongoDB version 2.4, here is what the MongoDB documentation (here and here) specifies:

WiredTiger Storage Engine

  1. Starting in MongoDB 3.2, the WiredTiger storage engine is the default storage engine.
  2. Minimum log record size for WiredTiger is 128 bytes. If a log record is 128 bytes or smaller, WiredTiger does not compress that record.

Memory Use

With WiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache.

Starting in 3.4, the WiredTiger internal cache, by default, will use the larger of either:

50% of (RAM - 1 GB), or
256 MB.
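
As a rough illustration of the rule above, here is a minimal Python sketch of the default cache calculation. The function name `default_wiredtiger_cache_bytes` is hypothetical (it is not part of MongoDB or any driver), and it assumes "GB" and "MB" mean binary units (GiB and MiB); the real server may round differently.

```python
GIB = 1024 ** 3  # assumption: "GB" in the rule is treated as GiB
MIB = 1024 ** 2  # assumption: "MB" in the rule is treated as MiB


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Hypothetical helper: larger of 50% of (RAM - 1 GB) or 256 MB."""
    half_of_ram_minus_1gb = (total_ram_bytes - 1 * GIB) // 2
    return max(half_of_ram_minus_1gb, 256 * MIB)


if __name__ == "__main__":
    # Print the default cache size for a few example machine sizes.
    for ram_gib in (1, 4, 16, 64):
        cache = default_wiredtiger_cache_bytes(ram_gib * GIB)
        print(f"{ram_gib} GiB RAM -> {cache / GIB:.2f} GiB cache")
```

Under these assumptions, a machine with 16 GiB of RAM gets a 7.5 GiB internal cache by default. The cache is sized per mongod instance, so the calculation does not depend on whether the 5 collections live in one database or five.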

By default, WiredTiger uses Snappy block compression for all collections and prefix compression for all indexes. Compression defaults are configurable at a global level and can also be set on a per-collection and per-index basis during collection and index creation.

Note: For existing deployments, if you do not specify the --storageEngine or the storage.engine setting, the version 3.2+ mongod instance can automatically determine the storage engine used to create the data files in the --dbpath or storage.dbPath.

MMAPv1 Storage Engine

  1. MMAPv1 is the default storage engine for MongoDB versions 3.0 and earlier.
  2. MongoDB 4.0 deprecates the MMAPv1 Storage Engine and will remove MMAPv1 in a future release. To change your MMAPv1 storage engine deployment to WiredTiger Storage Engine, see:

    a.) Change Standalone to WiredTiger

    b.) Change Replica Set to WiredTiger

    c.) Change Sharded Cluster to WiredTiger

Memory Use

With MMAPv1, MongoDB automatically uses all free memory on the machine as its cache. System resource monitors show that MongoDB uses a lot of memory, but its usage is dynamic. If another process suddenly needs half the server’s RAM, MongoDB will yield cached memory to the other process.

Technically, the operating system’s virtual memory subsystem manages MongoDB’s memory. This means that MongoDB will use as much free memory as it can, swapping to disk as needed. Deployments with enough memory to fit the application’s working data set in RAM will achieve the best performance.

For example, I need to have 5 collections, each about 10 GB. What is the difference in performance, with emphasis on memory usage, between assigning each collection to its own database and keeping all of the collections in the same database?

As Kevin Adistambha explained in a Google Groups forum post (here), it is not advisable to push any technology to such extreme numbers without careful thought, because of security, availability, and performance issues.

MongoDB is perfectly capable of handling 5 collections of 10 GB each in a single database.

The page MongoDB Limits and Thresholds describes the limits of MongoDB capabilities in detail.

For further reference, see here and here.