If this is a single-purpose machine, I would drop shared_buffers to a small fraction of what you have and increase effective_cache_size to 60GB. My reasoning is that the PostgreSQL cache is very featureful, but it is also slower than the OS cache. On a multi-purpose machine the Pg cache can be seen as memory reserved for Pg caching, which is quite helpful, but on a single-purpose machine there are significant tradeoffs between the Pg cache and the OS cache, and they don't all go one way. The best use of the PostgreSQL cache is to hold the most commonly used data so that it stays resident no matter what churns through the OS disk cache. Keep in mind that maintaining a buffer pool is comparatively expensive, so getting the Pg buffer pool to the right size (not necessarily the largest size you can!) is critical.
A good place to start is actually with your old VM's PostgreSQL settings, upping effective_cache_size appropriately. That way you avoid unexpected tradeoffs and can more easily compare against your existing use cases. From there, look at your workload and decide which values to tweak (this applies to work_mem as well: if work_mem was sufficient on the old instance, there is very little to be gained from increasing it here).
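As a rough illustration only (the exact numbers are assumptions and depend on your total RAM and workload), the relevant postgresql.conf entries for a dedicated ~64GB box might look something like:

```
# Hypothetical starting values for a dedicated machine - adjust to your RAM and workload.
shared_buffers = 8GB            # a modest fraction of RAM; the Pg buffer pool, not the OS cache
effective_cache_size = 60GB     # hint to the planner about how much the OS is likely to cache
work_mem = 16MB                 # per-sort/per-hash allocation; keep your old value if it was sufficient
```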
Finally, if things are still not up to speed, it would be helpful to run EXPLAIN (ANALYZE, BUFFERS) [query] on both systems so we can compare plans and offer better recommendations.
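For example (the table and filter below are placeholders; substitute one of your actual slow queries and run it on both servers so the plans and buffer counts can be compared):

```
-- Compare the plan, timings, and buffer hit/read counts between the two systems.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders                -- hypothetical table
WHERE customer_id = 42;    -- hypothetical filter
```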
There are a couple of separate points here, but I don't think the way MongoDB stores data in RAM is really relevant - MongoDB just uses the mmap() call and lets the kernel take care of memory management (by default the Linux kernel uses a Least Recently Used (LRU) policy to decide what to page out and what to keep - there are more specifics to that, but they aren't terribly relevant here).
In terms of your issues, it sounds like you might have had a corrupt index, though the evidence is somewhat circumstantial. Now that you have run a repair (the validate() command would have confirmed or ruled this out beforehand), there won't be any evidence left in the current data, but you may find more in the logs, particularly from when you were attempting to recreate the index or use it in queries.
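For future reference, you can check an individual collection before reaching for a repair; something like the following from the mongo shell (the collection name is just an example):

```
// full validation (true) is more thorough but slower and takes the usual locks
db.mycollection.validate(true)
```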
As for the spikes in page faults, B-tree stats, journal activity, lock percentage, and average flush time, that has all the hallmarks of a bulk delete, which causes a lot of index updates and generates a large amount of I/O. The fact that mapped memory drops off later in the graphs suggests that the repair significantly reduced the storage size, which usually indicates heavy fragmentation (bulk deletes, along with updates that grow documents, are the leading causes of fragmentation).
Therefore, I would look for a large delete operation logged as slow - it will only be logged once it completes, so look for it to appear after the end of the events in MMS. One of the quirks of not running a replica set is that a bulk operation like this is relatively non-obvious: it shows up as a single delete operation in the MMS graphs (usually lost in the noise).
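A quick way to hunt for it is to search the mongod log for long-running remove operations; slow ops are logged with their duration in milliseconds at the end of the line. The log path and duration pattern below are assumptions - adjust them for your installation and version:

```
# look for remove operations that took a long time (4+ digit ms durations)
grep " remove " /var/log/mongodb/mongod.log | grep -E "[0-9]{4,}ms"
```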
These bulk deletes usually target older data that has not been recently used and has therefore been paged out of active memory by the kernel (LRU again). To delete it, you must page all of that data back in, then flush the changes to disk, and of course deletes require the write lock - hence the spikes in faults, lock percentage, etc.
To make room for the data being deleted, your current working set is paged out, which will hurt the performance of your normal workload until the deletes complete and the memory pressure eases.
FYI - when you run a replica set, bulk operations are serialized into the oplog and hence replicated one at a time, so you can track them by their footprint in the replicated-ops stats of the secondaries. That is not possible on a standalone instance without digging through the logs for the completed operations and other secondary indications.
As for managing large deletes in the future, it is generally far more efficient to partition your data into separate databases (if possible) and then drop the old data when it is no longer needed by simply dropping the old databases. This requires some extra management on the application side, but it removes the need for bulk deletes, completes far more quickly, limits fragmentation, and dropping a database also removes its files on disk, preventing excessive storage use. Definitely recommended if your use case allows it.
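As a sketch of that pattern (the per-period database names are hypothetical - for example, one database per month):

```
// Write each period's data into its own database, e.g. "events_2014_06".
// When a period ages out, drop the whole database instead of bulk-deleting documents:
db.getSiblingDB("events_2014_03").dropDatabase()   // also removes the files on disk
```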
I can add that WiredTiger and MongoDB 3.2.11 will not help you. In theory it has no global locks, but in practice: I have a somewhat beefier server (12 cores, 96GB RAM) and two databases. DB1 is in production (it is only 300GB in size, so mostly in RAM) and DB2 is the one I am preparing for the next release.
Surprisingly, I found that mass updates and deletes on DB2 (also 300GB), even in a single thread (one console or one Java application), make regular indexed queries on DB1 run for 5-10 SECONDS.
When I stop all activity on DB2, they return to 300-400ms.
In other threads I've read that deleting by an unindexed field might lead to such behaviour, but unfortunately that is not my case: I update/delete by "_id" and performance is still terrible.