Linux overcommits memory: it lets processes request more memory than is actually available on the system. When a program calls malloc(), the kernel says "OK, you've got the memory" but doesn't reserve it. The memory is only reserved when the process writes to that space.
To see the difference, look at two indicators: virtual memory and resident memory.
Virtual is the memory requested by the process; resident is the memory the process is actually using.
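To compare the two numbers for a given process, one option is to read the VmSize (virtual) and VmRSS (resident) fields from /proc/<pid>/status. The following is only a minimal sketch: the function name `parse_vm_status` and the returned key names are made up for illustration.

```python
import os


def parse_vm_status(status_text):
    """Extract VmSize and VmRSS (both in kB) from the text of /proc/<pid>/status."""
    wanted = {"VmSize": "virtual_kb", "VmRSS": "resident_kb"}
    result = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "VmRSS:      1234 kB"; keep the number only.
            result[wanted[key]] = int(rest.split()[0])
    return result


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    # Linux only: inspect the current process.
    with open("/proc/self/status") as f:
        print(parse_vm_status(f.read()))
```

A process that has called malloc() for a large block without writing to it should show a virtual size much larger than its resident size.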
With this scheme you can end up "overbooked": the kernel has granted more memory than is available. Then, when the system is down to 0 bytes of free memory and swap, it must sacrifice (kill) a process to free some memory.
That's when the OOM Killer goes into action. It selects a process based on its memory consumption and several other factors (a parent gets half the score of its children; a root-owned process has its score divided by 4; etc.). Have a look at Linux-MM.org/OOM_Killer.
You can influence the OOM scoring by tuning the /proc/MySQL_PID/oom_adj file. By setting it to -17, your process will never be killed. But before doing that, you should tune your MySQL configuration file to limit MySQL's memory usage. Otherwise, the OOM Killer will kill other system processes (like SSH, crontab, etc.) and your server will be left in a very unstable state, possibly leading to data corruption, which is worse than anything.
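As a rough sketch of the same adjustment done from a script, the helper below writes to /proc/<pid>/oom_score_adj, the newer interface whose range is -1000 to 1000 (-1000 exempts the process, much like -17 in oom_adj). The function name `set_oom_score_adj` and the `proc_root` parameter are assumptions added for illustration and testing; lowering a score normally requires root.

```python
import os

OOM_SCORE_ADJ_MIN = -1000  # process is never selected by the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # process becomes the preferred victim


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write `value` to <proc_root>/<pid>/oom_score_adj and return the path."""
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be in [-1000, 1000], got {value}")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return path
```

The setting applies to the running process only, so it has to be reapplied each time MySQL restarts.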
Also, you may consider using more swap.
[EDIT]
You can also change the kernel's overcommit behaviour via these two sysctls:
vm.overcommit_memory
vm.overcommit_ratio
As stated in the kernel documentation:
overcommit_memory:
This value contains a flag that enables memory overcommitment.
When this flag is 0, the kernel attempts to estimate the amount
of free memory left when userspace requests more memory.
When this flag is 1, the kernel pretends there is always enough
memory until it actually runs out.
When this flag is 2, the kernel uses a "never overcommit"
policy that attempts to prevent any overcommit of memory.
Note that user_reserve_kbytes affects this policy.
This feature can be very useful because there are a lot of
programs that malloc() huge amounts of memory "just-in-case"
and don't use much of it.
The default value is 0.
See Documentation/vm/overcommit-accounting and
security/commoncap.c::cap_vm_enough_memory() for more information.
overcommit_ratio:
When overcommit_memory is set to 2, the committed address
space is not permitted to exceed swap plus this percentage
of physical RAM. See above.
[/EDIT]
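To make the overcommit_ratio rule concrete, here is a small sketch of the limit it describes for overcommit_memory = 2: swap plus a percentage of physical RAM. The function name `commit_limit_kb` is hypothetical, and the calculation deliberately ignores refinements such as huge pages and user_reserve_kbytes.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate commit limit in kB when vm.overcommit_memory is 2.

    Follows the quoted rule: swap plus `overcommit_ratio` percent of RAM.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


# Example: 8 GiB of RAM, 2 GiB of swap, ratio of 50.
limit = commit_limit_kb(8 * 1024 * 1024, 2 * 1024 * 1024, 50)
print(limit // (1024 * 1024), "GiB")
```

On a running system, the value the kernel actually uses is reported as CommitLimit in /proc/meminfo.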
oom_adj is deprecated and provided for legacy purposes only. Internally Linux uses oom_score_adj, which has a greater range: oom_adj goes up to 15 while oom_score_adj goes up to 1000.
Whenever you write to oom_adj (let's say 9) the kernel does this:
oom_adj = (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
and stores that to oom_score_adj. OOM_SCORE_ADJ_MAX is 1000 and OOM_DISABLE is -17.
So for 9 you'll get oom_adj = (9 * 1000) / 17 ~= 529.411, and since these values are integers, oom_score_adj will hold 529.
Now when you read oom_adj, the kernel does this:
oom_adj = (task->signal->oom_score_adj * -OOM_DISABLE) / OOM_SCORE_ADJ_MAX;
So for 529 you'll get oom_adj = (529 * 17) / 1000 = 8.993, and since the kernel uses integers and integer arithmetic, this becomes 8.
So there... you write 9 and you read back 8 because integer division truncates at each step.
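The round trip can be reproduced outside the kernel. The sketch below models only the two expressions quoted above and ignores any special cases the kernel may apply at the ends of the range. The helper names `c_div`, `write_oom_adj` and `read_oom_adj` are made up for illustration, and `c_div` exists because Python's `//` floors while C division truncates toward zero.

```python
OOM_SCORE_ADJ_MAX = 1000
OOM_DISABLE = -17


def c_div(a, b):
    """Integer division truncating toward zero, as in C."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def write_oom_adj(oom_adj):
    """Value stored in oom_score_adj when `oom_adj` is written."""
    return c_div(oom_adj * OOM_SCORE_ADJ_MAX, -OOM_DISABLE)


def read_oom_adj(oom_score_adj):
    """Value reported by oom_adj for a stored `oom_score_adj`."""
    return c_div(oom_score_adj * -OOM_DISABLE, OOM_SCORE_ADJ_MAX)


stored = write_oom_adj(9)
print(stored, read_oom_adj(stored))  # 529 8
```

Values such as -17 survive the round trip because 17000 divides evenly by 17; most others lose one step.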
Several modern dæmon supervision systems have a means for doing this. (Indeed, since there is a chain loading tool for the job, arguably they all have a means for doing this.)
Upstart: use oom score in the job file.
systemd: use the OOMScoreAdjust= setting in the service unit. You can use service unit patch files to affect pre-packaged service units.
Supervisors that start the service from a run program: use the oom-kill-protect tool from the nosh toolset in the run program for the service.
If you are converting a system service unit, the convert-systemd-units tool will in fact convert the OOMScoreAdjust= setting into such an invocation of oom-kill-protect.
As a bonus, you can make it parameterizable and set the value of the parameter in the service's environment (presumed to be read from an envdir associated with the service, and manipulated with the nosh toolset's rcctl shim).
Further reading
oom-kill-protect. nosh toolset. Softwares.
"oom score". Upstart Cookbook.
"OOMScoreAdjust". systemd.exec. systemd manual pages. freedesktop.org.
rcctl. nosh toolset. Softwares.