On one of our MySQL masters, the OOM killer was invoked and killed the MySQL server, which led to a major outage. The kernel log follows:
[2006013.230723] mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
[2006013.230733] Pid: 1319, comm: mysqld Tainted: P 2.6.32-5-amd64 #1
[2006013.230735] Call Trace:
[2006013.230744] [<ffffffff810b6708>] ? oom_kill_process+0x7f/0x23f
[2006013.230750] [<ffffffff8106bde2>] ? timekeeping_get_ns+0xe/0x2e
[2006013.230754] [<ffffffff810b6c2c>] ? __out_of_memory+0x12a/0x141
[2006013.230757] [<ffffffff810b6d83>] ? out_of_memory+0x140/0x172
[2006013.230762] [<ffffffff810baae8>] ? __alloc_pages_nodemask+0x4ec/0x5fc
[2006013.230768] [<ffffffff812fca02>] ? io_schedule+0x93/0xb7
[2006013.230773] [<ffffffff810bc051>] ? __do_page_cache_readahead+0x9b/0x1b4
[2006013.230778] [<ffffffff810652f8>] ? wake_bit_function+0x0/0x23
[2006013.230782] [<ffffffff810bc186>] ? ra_submit+0x1c/0x20
[2006013.230785] [<ffffffff810b4e53>] ? filemap_fault+0x17d/0x2f6
[2006013.230790] [<ffffffff810cae1e>] ? __do_fault+0x54/0x3c3
[2006013.230794] [<ffffffff812fce29>] ? __wait_on_bit_lock+0x76/0x84
[2006013.230798] [<ffffffff810cd172>] ? handle_mm_fault+0x3b8/0x80f
[2006013.230803] [<ffffffff8103a9a0>] ? pick_next_task+0x21/0x3c
[2006013.230808] [<ffffffff810168ba>] ? sched_clock+0x5/0x8
[2006013.230813] [<ffffffff81300186>] ? do_page_fault+0x2e0/0x2fc
[2006013.230817] [<ffffffff812fe025>] ? page_fault+0x25/0x30
This machine has 64GB RAM.
These are the relevant MySQL config variables:
innodb_buffer_pool_size = 48G
innodb_additional_mem_pool_size = 512M
innodb_log_buffer_size = 64M
Apart from some Nagios plugins and metric collection scripts, nothing else runs on this machine. Can someone help me find out why the OOM killer was invoked and how I can prevent it from being invoked in future? Is there any way I can tell the OOM killer not to kill the MySQL server? I know we can set a very low oom_adj value for a process to prevent it from being killed by the OOM killer, but is there any other way to prevent this?
Best Answer
Linux does memory overcommit. That means it allows processes to request more memory than is really available on the system. When a program calls malloc(), the kernel says "OK, you got the memory", but doesn't reserve it. The memory is only reserved when the process writes something into that space.
To see the difference, you have two indicators: virtual memory and resident memory. Virtual is the memory requested by the process; resident is the memory actually used by the process.
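As a rough illustration of those two indicators, the sketch below pulls the `VmSize` (virtual) and `VmRSS` (resident) lines out of `/proc/<pid>/status`. The function names are made up for this example, and it assumes a Linux `/proc` filesystem.

```python
def parse_memory_status(status_text):
    """Return (VmSize, VmRSS) in kB from the text of /proc/<pid>/status.

    Either value is None if the corresponding line is missing
    (kernel threads, for example, have no Vm* lines).
    """
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmRSS:\t  123456 kB"; keep the number only.
            fields[key] = int(rest.split()[0])
    return fields.get("VmSize"), fields.get("VmRSS")


def read_memory_status(pid):
    """Read and parse /proc/<pid>/status for a running process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_memory_status(f.read())
```

Calling `read_memory_status(pid)` with the mysqld PID should show a virtual size noticeably larger than the resident size while the buffer pool is still warming up.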
With this system, you may end up "overbooked": the kernel grants more memory than is available. Then, when your system runs down to 0 bytes of free memory and swap, it must sacrifice (kill) a process to free memory.
That's when the OOM killer goes into action. It selects a process based on its memory consumption and many other elements (a parent gains 1/2 of the score of its children; if it's a root-owned process, the score is divided by 4; etc.). Have a look at Linux-MM.org/OOM_Killer.
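The kernel exposes the resulting score for each process in `/proc/<pid>/oom_score`, so you can check who would be picked first. A minimal sketch, assuming a Linux `/proc` layout; `rank_by_oom_score` and its `proc_root` parameter are names invented for this example:

```python
import os


def rank_by_oom_score(proc_root="/proc"):
    """Return a list of (oom_score, pid) tuples, highest score first."""
    scores = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "oom_score")) as f:
                score = int(f.read().strip())
        except (OSError, ValueError):
            continue  # process exited or file unreadable
        scores.append((score, int(entry)))
    return sorted(scores, reverse=True)
```

The process at the top of the list is the most likely victim. On a dedicated database host that will usually be mysqld, simply because it holds most of the memory.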
You can influence the OOM scoring by tuning the /proc/MySQL_PID/oom_adj file. By setting it to -17, your process will never be killed. But before doing that, you should tweak your MySQL configuration file to limit MySQL's memory usage. Otherwise, the OOM killer will kill other system processes (like SSH, crontab, etc.) and your server will be left in a very unstable state, possibly leading to data corruption, which is worse than anything. Also, you may consider using more swap.
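Writing that value could look like the sketch below. It assumes root privileges and a kernel that still provides `oom_adj`, such as the 2.6.32 kernel in the question; newer kernels prefer `/proc/<pid>/oom_score_adj`, where -1000 plays the same role. The helper name and the `proc_root` parameter are hypothetical.

```python
import os

OOM_DISABLE = -17  # special oom_adj value: never select this process


def protect_from_oom_killer(pid, proc_root="/proc"):
    """Write -17 to <proc_root>/<pid>/oom_adj and return the path written."""
    path = os.path.join(proc_root, str(pid), "oom_adj")
    with open(path, "w") as f:
        f.write(str(OOM_DISABLE))
    return path
```

The setting belongs to the running process, so it has to be applied again after every mysqld restart, for example from the init script.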
[EDIT]
You can also change the kernel's overcommit behaviour via two sysctls, vm.overcommit_memory and vm.overcommit_ratio, as described in the kernel documentation.
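To get a feel for what strict accounting (vm.overcommit_memory = 2) would allow, the sketch below applies the commit limit formula from the kernel documentation: swap plus RAM multiplied by overcommit_ratio / 100, ignoring huge pages. The function name is made up, and the ratio and swap values you pass in are your own assumptions, not settings taken from the question.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate CommitLimit in kB under vm.overcommit_memory = 2.

    Follows the documented formula, ignoring huge pages:
    swap + RAM * overcommit_ratio / 100.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


# Hypothetical figures: 64 GiB of RAM, no swap, ratio raised to 80.
ram_kb = 64 * 1024 * 1024
limit_kb = commit_limit_kb(ram_kb, swap_kb=0, overcommit_ratio=80)
print(f"commit limit: {limit_kb / 1024 / 1024:.1f} GiB")
```

With the default ratio of 50 and little swap, the limit on a 64GB machine comes out near 32 GiB, which is below the 48G buffer pool, so the ratio (or swap) would need to be raised before switching to strict mode.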
[/EDIT]