Yes.
You should most definitely always have swap enabled, unless there is a very compelling reason not to (such as no disk at all, or only network storage). Should you have swap on the order of the often-recommended ridiculous sizes, such as twice the amount of RAM? Well, no.
The reason is that swap is not merely useful when your applications consume more memory than there is physical RAM (in fact, in that case swap is not very useful at all, because it seriously impacts performance). The main incentive for swap nowadays is not to magically turn 16 GiB of RAM into 32 GiB, but to make more efficient use of the installed, available RAM.
On a modern computer, RAM does not go unused. Unused RAM is something that you could just as well not have bought and saved the money instead. Therefore, anything you load or anything that is otherwise memory-mapped, anything that could possibly be reused by anyone any time later (limited by security constraints) is being cached. Very soon after the machine has booted, all physical RAM will have been used for something.
Whenever you ask for a new memory page from the operating system, the memory manager has to make an educated decision:
1. Purge a page from the buffer cache
2. Purge a page from a mapping (effectively the same as #1, on most systems)
3. Move a page that has not been accessed for a long time -- preferably never -- to swap (this could in fact even happen proactively, not necessarily at the very last moment)
4. Kill your process, or kill a random process (OOM killer)
5. Kernel panic
Options #4 and #5 are very undesirable and will only happen if the operating system has absolutely no other choice. Options #1 and #2 mean that you throw something away that you will possibly be needing soon again. This negatively impacts performance.
Option #3 means you move something that you (probably) don't need any time soon onto slow storage. That's fine because now something that you do need can use the fast RAM.
By removing option #3, you have effectively limited the operating system to doing either #1 or #2. Reloading a page from disk costs the same as reloading it from swap, except that reloading from swap is usually less likely to be needed (because the pager makes sensible decisions about what to evict).
In other words, by disabling swap you gain nothing, but you limit the operating system's number of useful options in dealing with a memory request. This might not be a disadvantage, but very possibly is one (and it will never be an advantage).
[EDIT]
The careful reader of the mmap manpage, specifically the description of MAP_NORESERVE, will notice another good reason why swap is somewhat of a necessity even on a system with "enough" physical memory:
"When swap space is not reserved one might get SIGSEGV upon a write if no physical memory is available."
-- Wait a moment, what does that mean?
If you map a file, you can access the file's contents directly, as if the file were somehow, by magic, in your program's address space. For read-only access, the operating system in principle needs no more than a single page of physical memory, which it can repopulate with different data every time you access a different virtual page (for efficiency reasons, that's of course not what is done, but in principle you could access terabytes worth of data with a single page of physical memory).
Now what if you also write to a file mapping? In this case, the operating system must have a physical page -- or swap space -- ready for every page written to. There's no other way to keep the data around until the dirty-pages writeback process has done its work (which can take several seconds). For this reason, the OS reserves (but doesn't necessarily ever commit) swap space, so in case you are writing to a mapping while there happens to be no unused physical page (a quite possible, and normal, condition), you're guaranteed that it will still work.
Now what if there is no swap? It means that no swap can be reserved (duh!), which means that as soon as there are no free physical pages left and you write to a page, you get a pleasant surprise in the form of your process receiving a segmentation fault, and probably being killed.
[/EDIT]
However, the traditional recommendation of making swap twice the size of RAM is nonsensical. Although disk space is cheap, it does not make sense to assign that much swap. Wasting something that is cheap is still wasteful, and you absolutely don't want to be continually swapping in and out working sets several hundred megabytes (or larger) in size.
There is no single "correct" swap size (there are as many "correct" sizes as there are users and opinions). I usually assign a fixed 512 MiB, regardless of RAM size, which works very well for me. The reasoning behind that is that 512 MiB is something you can always afford nowadays, even on a small disk. On the other hand, adding several gigabytes of swap is no better. You are not going to use them, unless something is going seriously wrong.
Even on a SSD, swap is orders of magnitude slower than RAM (due to bus bandwidth and latency), and while it is very acceptable to move something to swap that probably won't be needed again (i.e. you most likely won't be swapping it in again, so your pool of available pages is effectively enlarged for free), if you really need considerable amounts of swap (that is, you have an application that uses e.g. a 50GiB dataset), you're pretty much lost.
Once your computer starts swapping in and out gigabytes worth of pages, everything goes to a crawl. So, for most people (including me) this is not an option, and having that much swap therefore makes no sense.
A little reading to better understand each column, because it's not just disk or memory but also shared libraries:
VIRT stands for the virtual size of a process, which is the sum of memory it is actually using, memory it has mapped into itself (for instance the video card’s RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the program is able to access at the present moment.
RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming. (This also corresponds directly to the %MEM column.) This will virtually always be less than the VIRT size, since most programs depend on the C library.
SHR indicates how much of the VIRT size is actually sharable (memory or libraries). In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and will be counted in VIRT and SHR, but only the parts of the library file containing the functions being used will actually be loaded in and be counted under RES.
I think you have a problem with Evolution (its database job), but I can't verify that because I don't use Evolution, sorry.
Best Answer
You're asking me to guess and put an upper bound on it.
I can try to share my experience. I won't say you shouldn't ask for high standards, I just want to be realistic about the standard that Linux currently meets :-).
Given your amount of RAM, swap, and type of storage; if the RAM usage is due to multiple interactive apps, only one of which is being interacted with; if you hadn't left an operation running in any of the other apps; and if the other apps don't have a large number of tabs with animated advertisements in them :) -- in that case, I think you make a good point! My current intuition says it would be unusual for the system to take longer than 10 minutes to clear up and become workable.
Do I think you should ever wait 10 minutes, hoping the mouse cursor will start working and the disk light will calm down again?
Not exactly. If the hope is to wait for the GUI to settle and become usable: if it takes even as long as 2 minutes for the GUI to be usable, and my goal isn't just to start closing some windows of the current foreground app, then you'd expect this delay to keep happening again. That's clearly too long.
But secondly, there are various possible problems this could be. For example, if half the problem is a concurrent operation in a second program that I'm not aware of, then the system's working set could be larger than RAM. In that case, yes, the system could thrash for more hours than you can count, so you wouldn't want to wait.
If I was trying to get insights about what was going wrong, my maximum timeout might be 15 minutes. This would be the overall timeout for getting data out of a sequence such as:
- sudo tmux -- a text window manager. Now I can run multiple commands as root, and switch between them, without getting login or sudo delays.
- atop -R -- did I mention I love atop?
- iotop -- horrible delays are usually about I/O. This is a nice tool that does one thing :-).
- journalctl --since=-1hour -f