Linux – OOM Killer – killed MySQL server

linux, MySQL, out of memory

On one of our MySQL masters, the OOM killer was invoked and killed the MySQL server, which led to a major outage. The kernel log follows:

[2006013.230723] mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
[2006013.230733] Pid: 1319, comm: mysqld Tainted: P           2.6.32-5-amd64 #1
[2006013.230735] Call Trace:
[2006013.230744]  [<ffffffff810b6708>] ? oom_kill_process+0x7f/0x23f
[2006013.230750]  [<ffffffff8106bde2>] ? timekeeping_get_ns+0xe/0x2e
[2006013.230754]  [<ffffffff810b6c2c>] ? __out_of_memory+0x12a/0x141
[2006013.230757]  [<ffffffff810b6d83>] ? out_of_memory+0x140/0x172
[2006013.230762]  [<ffffffff810baae8>] ? __alloc_pages_nodemask+0x4ec/0x5fc
[2006013.230768]  [<ffffffff812fca02>] ? io_schedule+0x93/0xb7
[2006013.230773]  [<ffffffff810bc051>] ? __do_page_cache_readahead+0x9b/0x1b4
[2006013.230778]  [<ffffffff810652f8>] ? wake_bit_function+0x0/0x23
[2006013.230782]  [<ffffffff810bc186>] ? ra_submit+0x1c/0x20
[2006013.230785]  [<ffffffff810b4e53>] ? filemap_fault+0x17d/0x2f6
[2006013.230790]  [<ffffffff810cae1e>] ? __do_fault+0x54/0x3c3
[2006013.230794]  [<ffffffff812fce29>] ? __wait_on_bit_lock+0x76/0x84
[2006013.230798]  [<ffffffff810cd172>] ? handle_mm_fault+0x3b8/0x80f
[2006013.230803]  [<ffffffff8103a9a0>] ? pick_next_task+0x21/0x3c
[2006013.230808]  [<ffffffff810168ba>] ? sched_clock+0x5/0x8
[2006013.230813]  [<ffffffff81300186>] ? do_page_fault+0x2e0/0x2fc
[2006013.230817]  [<ffffffff812fe025>] ? page_fault+0x25/0x30

This machine has 64GB RAM.

These are the relevant MySQL config variables:

innodb_buffer_pool_size        = 48G
innodb_additional_mem_pool_size = 512M
innodb_log_buffer_size         = 64M

Apart from a few Nagios plugins and metric collection scripts, nothing else runs on this machine. Can someone help me find out why the OOM killer was invoked and how I can prevent it from being invoked in the future? Is there any way to tell the OOM killer not to kill the MySQL server? I know we can set a very low oom_adj value for a process to keep the OOM killer from killing it, but is there any other way to prevent this?

Best Answer

Linux overcommits memory. That means it allows processes to request more memory than is really available on the system. When a program calls malloc(), the kernel says "OK, you've got the memory", but doesn't reserve it. The memory is only reserved when the process writes something into that space.

To see the difference, there are two indicators: virtual memory and resident memory. Virtual is the memory requested by the process; resident is the memory actually used by the process.
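To compare the two figures for a running process, one option is to read the VmSize and VmRSS fields from /proc/<pid>/status. The following is a minimal Python sketch, assuming a Linux /proc filesystem; the helper name parse_vm_status is made up for illustration.

```python
import os

def parse_vm_status(status_text):
    """Return (virtual_kb, resident_kb) parsed from /proc/<pid>/status text.

    Either value is None if the field is absent (e.g. kernel threads).
    """
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in ("VmSize", "VmRSS"):
            # Values look like "   123456 kB"; keep only the integer part.
            fields[key] = int(value.split()[0])
    return fields.get("VmSize"), fields.get("VmRSS")

if __name__ == "__main__":
    # "self" inspects this script; substitute the mysqld PID as needed.
    path = "/proc/self/status"
    if os.path.exists(path):
        with open(path) as f:
            virt, res = parse_vm_status(f.read())
        print("virtual kB:", virt, "resident kB:", res)
```

A large gap between the two numbers means the process has requested far more address space than it has touched so far.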

With this scheme you can end up "overbooked": the kernel grants more memory than is available. Then, when the system reaches 0 bytes of free memory and swap, the kernel must sacrifice (kill) a process to free memory.

That's when the OOM killer goes into action. It selects a process based on its memory consumption and many other factors (a parent gains half the score of its children; if the process is owned by root, its score is divided by 4; etc.). Have a look at Linux-MM.org/OOM_Killer.
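The score the kernel currently assigns to each process is exposed in /proc/<pid>/oom_score, so you can check in advance which process is the most likely victim. A rough Python sketch follows; the function name rank_by_oom_score and its proc_root parameter are illustrative, not part of any tool.

```python
import os

def rank_by_oom_score(proc_root="/proc"):
    """Return [(score, pid), ...] sorted from most to least likely OOM victim."""
    ranked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "oom_score")) as f:
                ranked.append((int(f.read().strip()), int(entry)))
        except (OSError, ValueError):
            continue  # process exited in the meantime, or file unreadable
    return sorted(ranked, reverse=True)

if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for score, pid in rank_by_oom_score()[:5]:
            print(pid, score)
```

On a dedicated database host you would expect mysqld to be at or near the top of this list, since it holds most of the memory.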

You can influence the OOM scoring by tuning the /proc/MySQL_PID/oom_adj file. Setting it to -17 means your process will never be killed. But before doing that, you should tweak your MySQL configuration file to limit MySQL's memory usage. Otherwise, the OOM killer will kill other system processes (like SSH, crontab, etc.) and your server will be left in a very unstable state, possibly leading to data corruption, which is worse than anything.
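As an illustration, writing the value could look like the Python sketch below. The helper set_oom_adj and its proc_root parameter are hypothetical names used only here. It assumes a kernel that exposes oom_adj, such as the 2.6.32 in the question (newer kernels document oom_score_adj as the replacement), and lowering the value needs root privileges.

```python
import os

OOM_DISABLE = -17  # special oom_adj value: exempt the process from the OOM killer

def set_oom_adj(pid, value=OOM_DISABLE, proc_root="/proc"):
    """Write `value` to <proc_root>/<pid>/oom_adj and return the path written."""
    if not -17 <= value <= 15:
        raise ValueError("oom_adj must be between -17 and 15")
    path = os.path.join(proc_root, str(pid), "oom_adj")
    with open(path, "w") as f:
        f.write(str(value))
    return path

# Example (as root), with 1319 being the mysqld PID from the kernel log:
#   set_oom_adj(1319)
```

The setting belongs to the PID, so it is lost whenever mysqld restarts and has to be reapplied, for example from the init script.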

Also, you may consider using more swap.

[EDIT]

You can also change the kernel's overcommit behaviour via these two sysctls:

vm.overcommit_memory
vm.overcommit_ratio

As stated in the kernel documentation:

overcommit_memory:

This value contains a flag that enables memory overcommitment.

When this flag is 0, the kernel attempts to estimate the amount of free memory left when userspace requests more memory.

When this flag is 1, the kernel pretends there is always enough memory until it actually runs out.

When this flag is 2, the kernel uses a "never overcommit" policy that attempts to prevent any overcommit of memory. Note that user_reserve_kbytes affects this policy.

This feature can be very useful because there are a lot of programs that malloc() huge amounts of memory "just-in-case" and don't use much of it.

The default value is 0.

See Documentation/vm/overcommit-accounting and security/commoncap.c::cap_vm_enough_memory() for more information.

overcommit_ratio:

When overcommit_memory is set to 2, the committed address space is not permitted to exceed swap plus this percentage of physical RAM. See above.

[/EDIT]
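To get a feel for what overcommit_memory = 2 would allow on this machine, the rule quoted above (swap plus a percentage of physical RAM) can be worked through in a few lines of Python. This is a simplified sketch: commit_limit_kb is an illustrative name, the real kernel calculation also accounts for details such as huge pages, and the swap size is an assumption because the question does not state it.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate commit limit in kB: swap + overcommit_ratio percent of RAM."""
    return swap_kb + ram_kb * overcommit_ratio // 100

GB = 1024 * 1024       # kB per GB
ram_kb = 64 * GB       # 64GB of RAM, as in the question
swap_kb = 0            # ASSUMPTION: swap size is not given in the question

# 50 is the kernel's default vm.overcommit_ratio.
for ratio in (50, 80):
    limit = commit_limit_kb(ram_kb, swap_kb, ratio)
    print("ratio %d%%: %.1f GB" % (ratio, limit / GB))
```

Under these assumptions the default ratio of 50 gives a limit of 32 GB, which is below the 48G innodb_buffer_pool_size alone, so switching to mode 2 without raising the ratio or adding swap would likely make mysqld fail to allocate its buffer pool.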
