Mongodb – Impact of RAM size on mongodb performance

mongodb

I'm trying to test the impact of RAM size on MongoDB performance. So far I can't find any such impact, and I'm trying to figure out whether my methodology is correct and, if not, what to change.

I ran a read-only scenario against a big database (1.4TB on disk) on a server with 768GB of RAM, and again on the same hardware but with only 64GB.
In both cases I reached very similar throughput, which seems strange to me.
In addition, I can't find evidence of significant RAM usage, which is completely unclear to me.

Details of my tests:

  • server: MongoDB 4.0.6 (with the WiredTiger storage engine)

  • I'm using YCSB as the benchmark client. I picked workload-c, which consists of read-only operations. In the configuration I changed the value of 'requestdistribution' from 'zipfian' to 'uniform'. The purpose is to avoid, as much as possible, re-reading recently read data, which is most likely already cached.

  • loading the database:

    • Using YCSB, I loaded the database with:
      • 1,200,000,000 documents
      • 10 fields per document
      • 100 bytes per field

This takes ~1.5TB of disk space. MongoDB reported the following (db.stats()):

        "db" : "ycsb",
        "collections" : 1,
        "views" : 0,
        "objects" : 1,200,000,000         # below there's an example of a sample document
        "avgObjSize" : 1167.8795327608334,
        "dataSize" : 1,401,455,439,313,
        "storageSize" : 1510564220928,    # =1.4TB - as expected
        "numExtents" : 0,
        "indexes" : 1,
        "indexSize" : 85,697,032,192,     # ~80GB
        "fsUsedSize" : 2383349882880,
        "fsTotalSize" : 3199068983296,
        "ok" : 1
}
  • example of one document:
{ "_id" : "user6284781860667377211", 
    "field1" : BinData(0,"KkQjPEYjKE1lMkt/KDI0OTBiNyM0MDF0JyB8L0QpNSEkLDgqL1o3ISVmKi4qIEFrNUU7JCtgKzAgIUVzJjk+MSRwNi5kMyFoIzwgPTkmPi1kMlAjPE53L10tMid8PlN9IE4hMA=="),
    "field0" : BinData(0,"IEk7MUVjJ15rJ0xpOkghJUtxO1h7Nlo9Li9iNiM4Pjk2PDEkKEc3IV5/PTlsKjFiMTU4OTIsICEqNFkpOlQ1OFYtNEYnOFYhNEohNUEtIyI6NSBsJFhzNykqLz82MFQ1NF11Pw=="),
    "field7" : BinData(0,"JiYyOEMxIVx7PlI/LDMuOCE6PDFgPjJiL181NSZuMCU6KCR4KV43NUQlPFZ1Olt1MFc1ME45LzpiPzVqJlI5OCx0OU9rO10zNFtpKy9qNT9uNk4lIFd5PyV0OUh7Ij9gNUxzMg=="),
    "field6" : BinData(0,"PjMkPFcpKiBiLyI+OiZsMjwuKzpyKlovKjYmPCs0K1FlNiYyLVN/Okx1NSY0NVRjJDUuKDkkLFUzKSkmNStyPlRlJT48LFhlMjhyPl8tNDcgNVM1ODUkLSs+IF89LSViKCt4Mw=="),
    "field9" : BinData(0,"LjwuOS5wMUwpIVIhOFFvJ01vNl4tPCloMS08JSRsKVE3LFAlO1htJi8qLCI0IF45LD0+ISM4MjQ2JCc8PVA1LVxnMjo2KEFnOC1wNyJiLklnIzBkIDYgOFZnMT1gL0YlMyEgJA=="),
    "field8" : BinData(0,"PlBtOyM+NUhlOzIwMSBwNzUmJ1ojPEE5LyEmNVUhJDt0I0pvJytgNy0kNiogJVR3MkYtIV4/PDRkM1k7Ii14OixoLz0sMTcoOER/MkxrIVpzN1crPlx1OlE3OER1Ky1mIFdpJg=="),
    "field3" : BinData(0,"NDV+P0MtPTRgOkRjLEFhP1NrK0gvNEZnMkQ/LTpkNFl3O1o1I0EzODUiJUIhJC4uIU0vMFJxMD4+PDRkLkJlI0gtJ0QlKCwsPVh1JzsmNTs4NDg0O0ptMj4gLz9sKl4vMzo8LQ=="),
    "field2" : BinData(0,"N1VzJjMkJFVhL1c5O1BrLk1tNyxkNkNtLDQyKEQlPC8yIF0lMSNsLyEuLlZpMjh8IU81PD5uNzZuMihqMEl/IjZkPysuNDZ2Kk9rOy12K007LTwgK0YjIDMyPyo8OCJsKzomMQ=="),
    "field5" : BinData(0,"JFA5MTh8MS0+IVh/Myh4M14nKFM5KVFhJDt2Mlo1MlZtJydyLSFkMl81PUAtNU4tMT04LS8oJEg9KTt2MTMkIUc9OTo4LFM5PEthPkgjLDdmKlI/NTg6KV0hNlwvKS4gPTM2OA=="),
    "field4" : BinData(0,"KVxvJiAuMT9oLjIqO1l1Ok9xLksjOCxgNCJkNzF6Nyp8N0w/OCQkMUl7LyVoM0NlIk85MUspNSF8Plc5NjV6KkE7MVRhKzU2P0EhJFghKlIvLCpwJUF9NzA0L1gtPSl6MVFnNg==") 
}
  • the index:
> db.usertable.getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "ycsb.usertable"
        }
]
>

I performed three tests. The results look very similar:

- warm DB - 768GB RAM: throughput: 311,075 TPS 
- cold DB - 768GB RAM: throughput: 320,626 TPS 
- cold DB -  64GB RAM: throughput: 326,313 TPS 

'warm DB' is a test performed right after loading the database.

memory usage (collected with dstat):

used  buff  cach  free  |   test description
------------------------------------------------
352G 12.3M 65.3M  403G  - warm DB - 768GB RAM
9.6G  8.1M  208M  746G  - cold DB - 768GB RAM
5.3G 13.5M  219M 57.2G  - cold DB -  64GB RAM
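
dstat only shows host-wide memory, so it can hide what MongoDB itself is caching. A more direct way to see how much data WiredTiger has actually pulled into RAM is its cache counters, reported by serverStatus() in 4.0; for example, from the mongo shell:

```
> db.serverStatus().wiredTiger.cache["maximum bytes configured"]
> db.serverStatus().wiredTiger.cache["bytes currently in the cache"]
```

Comparing these two values during the run shows how full the WiredTiger cache is, independently of the OS page cache.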

Network usage between the client and the server was ~415MB/sec sent by the server. This is very reasonable for a throughput of 320,000 TPS at ~1,400 bytes each.
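
As a quick sanity check on that figure (the 1,400 bytes/op is an assumption: the ~1,168-byte average document plus some wire-protocol overhead):

```javascript
// Rough bandwidth estimate for the observed throughput.
// bytesPerOp is an assumption: avg document size (~1,168 bytes)
// plus BSON/wire-protocol overhead.
const tps = 320000;          // observed ops/sec
const bytesPerOp = 1400;     // assumed bytes sent per read
const mbPerSec = tps * bytesPerOp / (1024 * 1024);
console.log(mbPerSec.toFixed(0) + " MB/sec");  // prints "427 MB/sec"
```

which is in the same ballpark as the ~415MB/sec measured on the wire.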

I expected to see in my tests:

- significant RAM usage, or at least the ~80GB index loaded into memory;
- a clear impact of RAM size on throughput.

Any clue?

Best Answer

I found that although my database has 1,200,000,000 documents, only 1,000 of them were being queried, due to YCSB defaults. To change this, set the 'recordcount' parameter to the number of records to consider.
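
For completeness, these are the relevant YCSB properties for the run phase (the values below are illustrative: 'recordcount' must match the number of documents actually loaded, and 'operationcount' is however many operations the test should execute):

```
# workload-c overrides (can also be passed with -p on the ycsb command line)
recordcount=1200000000
operationcount=100000000
requestdistribution=uniform
```

With the default recordcount of 1000, the uniform distribution only ever touches the first 1,000 keys, so the working set fits in cache regardless of RAM size.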