Linux – Why does the Linux out-of-memory (OOM) killer not run automatically, but work when triggered via the SysRq key?

linux, out of memory

I have found that when running into an out-of-memory (OOM) situation, the UI of my Linux box freezes completely for a very long time.

I have set up the magic SysRq key using echo 1 | tee /proc/sys/kernel/sysrq. When I next encountered an OOM situation with an unresponsive UI, I was able to press Alt-SysRq-f, which, as the dmesg log showed, caused the OOM killer to kill a process and thereby resolve the OOM situation.
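A side note for readers whose Alt-SysRq-f does nothing: the value in /proc/sys/kernel/sysrq is either 0 (disabled), 1 (all functions enabled), or a bitmask of allowed function groups. According to the kernel's SysRq documentation, bit 64 covers "signalling of processes", which includes the OOM-kill function behind the f key. The following is a minimal Python sketch of that check. The function names are made up for illustration, and 176 is used only as an example of a restrictive bitmask.

```python
# Bit in /proc/sys/kernel/sysrq that enables "signalling of processes"
# (term, kill, oom-kill) from the keyboard, per the kernel SysRq documentation.
SYSRQ_ENABLE_SIGNAL = 64


def sysrq_allows_oom_kill(value):
    """Return True if this sysrq setting lets Alt-SysRq-f invoke the OOM killer.

    Hypothetical helper name, not part of any library.
    """
    if value == 1:  # 1 means "enable all SysRq functions"
        return True
    return bool(value & SYSRQ_ENABLE_SIGNAL)


def read_sysrq_setting(path="/proc/sys/kernel/sysrq"):
    """Read the current setting; only works on Linux with /proc mounted."""
    with open(path) as f:
        return int(f.read().strip())


print(sysrq_allows_oom_kill(1))    # everything enabled
print(sysrq_allows_oom_kill(176))  # 128 + 32 + 16: bit 64 is not set
print(sysrq_allows_oom_kill(0))    # SysRq disabled entirely
```

On a real machine, `sysrq_allows_oom_kill(read_sysrq_setting())` would report whether the key combination is permitted with the current setting.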

My question is: why does Linux become so unresponsive that the GUI freezes, yet apparently not trigger the same OOM killer that I triggered manually via the Alt-SysRq-f key combination?

Considering that in the "frozen" OOM situation the system is so unresponsive that it does not even respond in a timely manner (< 10 sec) to Ctrl-Alt-F3 (switch to tty3), I would assume the kernel must be aware of its unresponsiveness. Yet it did not invoke the OOM killer by itself, as Alt-SysRq-f does. Why?

These are some settings that might have an impact on the described behaviour.

$> mount | grep memory
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
$> cat /sys/fs/cgroup/memory/memory.oom_control 
oom_kill_disable 0
under_oom 0
oom_kill 0

As I understand it, this states that the memory cgroup has the OOM killer neither activated nor disabled (presumably there is a good reason for oom_kill to be neither active nor disabled, or maybe I am not interpreting the output correctly; the meaning of under_oom 0 is also somewhat unclear to me).
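For readers puzzling over the same output: as far as the cgroup v1 memory documentation describes it, oom_kill_disable is a switch (0 means the OOM killer is enabled for this cgroup, 1 means it is disabled), under_oom says whether the cgroup is currently in an OOM state, and oom_kill is a counter of processes killed so far. A small Python sketch that applies this reading to the output above; the function names are invented for illustration.

```python
# Output copied from the question above.
SAMPLE = """\
oom_kill_disable 0
under_oom 0
oom_kill 0
"""


def parse_oom_control(text):
    """Turn the 'key value' lines of memory.oom_control into a dict of ints."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            fields[parts[0]] = int(parts[1])
    return fields


def describe_oom_control(fields):
    """Describe the fields, following the cgroup v1 documentation's meaning."""
    killer = "disabled" if fields.get("oom_kill_disable") == 1 else "enabled"
    state = "is" if fields.get("under_oom") == 1 else "is not"
    kills = fields.get("oom_kill", 0)
    return [
        f"OOM killer for this cgroup: {killer}",
        f"cgroup {state} currently under OOM",
        f"processes killed by the OOM killer so far: {kills}",
    ]


for line in describe_oom_control(parse_oom_control(SAMPLE)):
    print(line)
```

Read this way, the output in the question says that the OOM killer is enabled for the cgroup, that the cgroup is not under OOM at the moment, and that nothing has been killed yet.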

Best Answer

The reason the OOM killer is not invoked automatically is that the system, although already extremely slow and unresponsive when close to out-of-memory, has not actually reached the out-of-memory situation.

Oversimplified, the almost full RAM contains three types of data:

  1. kernel data, that is essential
  2. pages of essential process data (e.g. any data the process created in ram only)
  3. pages of non-essential process data (e.g. the code of executables, for which there is a copy on disk in the filesystem, and which, while currently mapped into memory, could be reread from disk when needed)

In a memory-starved situation the Linux kernel (as far as I can tell, it is the kswapd0 kernel thread that does this) cannot throw away 1. and 2. without losing data and functionality, but it is at liberty to remove from RAM, at least temporarily, the mapped-into-memory file data of processes that are not currently running.
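To get a feel for how much memory falls into each of the three categories, the counters in /proc/meminfo can be grouped roughly along the same lines. The Python sketch below does that on a sample excerpt. The mapping is my own approximation (for instance, part of Slab is itself reclaimable), the function names are hypothetical, and the sample numbers are invented.

```python
# Hypothetical /proc/meminfo excerpt; the numbers are invented for illustration.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemAvailable:     120000 kB
AnonPages:       7000000 kB
Active(file):      60000 kB
Inactive(file):    40000 kB
Slab:             300000 kB
KernelStack:       20000 kB
PageTables:        80000 kB
"""


def parse_meminfo(text):
    """Parse 'Name:   value [kB]' lines into a dict mapping name to integer."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            info[name.strip()] = int(rest.split()[0])
    return info


def rough_breakdown(info):
    """Group counters into the three categories above (a rough approximation)."""
    return {
        # 1. kernel data
        "kernel_kb": info.get("Slab", 0)
        + info.get("KernelStack", 0)
        + info.get("PageTables", 0),
        # 2. process data that exists only in RAM (anonymous pages)
        "anonymous_kb": info.get("AnonPages", 0),
        # 3. file-backed pages that could be reread from disk
        "file_backed_kb": info.get("Active(file)", 0)
        + info.get("Inactive(file)", 0),
    }


print(rough_breakdown(parse_meminfo(SAMPLE_MEMINFO)))
```

On a live system the same functions could be fed with `open("/proc/meminfo").read()`. In the invented sample, only a small file-backed remainder is left for the kernel to evict and reload, which is the situation described here.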

This behaviour, which involves disk thrashing (constantly throwing away data and rereading it from disk), can be seen as helpful, as it avoids, or at least postpones, the need to kill a process and to free (but also lose) its memory. However, it has a high price: performance.

[load pages from disk to ram with code of executable of process 1]
[ run process 1 ] 
[evict pages with binary of process 1 from ram]
[load pages from disk to ram with code of executable of process 2]
[ run process 2 ] 
[evict pages with binary of process 2 from ram]
[load pages from disk to ram with code of executable of process 3]
[ run process 3 ] 
[evict pages with binary of process 3 from ram]
....
[load pages from disk to ram with code of executable of process 1]
[ run process 1 ] 
[evict pages with binary of process 1 from ram]

Such a cycle is clearly I/O expensive, and the system is likely to become unresponsive, even though technically it has not yet completely run out of memory.
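The cost of this cycle can be illustrated with a toy model in Python. It assumes strict least-recently-used eviction and round-robin scheduling, which is much simpler than what the kernel really does, and the function name and parameters are invented for this sketch. It counts how many pages have to be read from disk when the executable pages of all processes do or do not fit into the remaining RAM.

```python
from collections import OrderedDict


def count_disk_reads(num_processes, pages_per_process, cache_pages, rounds):
    """Count page loads from disk for round-robin processes under LRU eviction.

    Toy model: every process touches all of its code pages each time it runs.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    disk_reads = 0
    for _ in range(rounds):
        for proc in range(num_processes):
            for page in range(pages_per_process):
                key = (proc, page)
                if key in cache:
                    cache.move_to_end(key)  # hit: mark as recently used
                else:
                    disk_reads += 1         # miss: load the page from disk
                    cache[key] = None
                    if len(cache) > cache_pages:
                        cache.popitem(last=False)  # evict least recently used
    return disk_reads


# 3 processes with 4 code pages each, each scheduled 10 times.
print(count_disk_reads(3, 4, cache_pages=12, rounds=10))  # everything fits: 12
print(count_disk_reads(3, 4, cache_pages=8, rounds=10))   # does not fit: 120
```

In this model, once the pages no longer fit, every single page access turns into a disk read (120 instead of 12), because each page is evicted before its process runs again.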

From a user perspective, however, the system seems to be hung or frozen, and the resulting unresponsive UI might not really be preferable to simply killing a process (e.g. a browser tab, whose memory usage may well have been the root cause to begin with).

This is where, as the question indicated, using the Magic SysRq key to invoke the OOM killer manually is useful, as the Magic SysRq key is less affected by the unresponsiveness of the system.
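Besides the keyboard, the kernel's SysRq documentation also describes /proc/sysrq-trigger: writing the letter f to it as root requests the same one-off OOM kill as Alt-SysRq-f. A minimal Python sketch, assuming root privileges and a kernel built with SysRq support; the function name is made up for illustration.

```python
def trigger_oom_killer(trigger_path="/proc/sysrq-trigger"):
    """Write 'f' to the SysRq trigger file to request one OOM-killer run.

    Hypothetical helper. Needs root on a real system, and it WILL kill a
    process there, so use it with care.
    """
    with open(trigger_path, "w") as trigger:
        trigger.write("f")


# Deliberately not called here: on a real system this kills a process.
# trigger_oom_killer()
```

Whether a script like this still gets scheduled in time on a thrashing system is a separate question. The keyboard route is handled inside the kernel, which is presumably why it stays responsive.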

While there might be use cases where it is important to preserve the processes at all (performance) costs, on a desktop it is likely that users would prefer the OOM killer over the frozen UI. There is a patch that claims to exempt clean, mapped, filesystem-backed files from being evicted from memory in such a situation, described in this answer on Stack Overflow.
