The "memory used by a process" is not a clear cut concept in modern operating systems. What can be measured is the size of the address space of the process (SIZE) and resident set size (RSS, how many of the pages in the address space are currently in memory). Part of RSS is shared (most processes in memory share one copy of glibc, and so for assorted other shared libraries; several processes running the same executable share it, processes forked share read-only data and possibly a chunk of not-yet-modified read-write data with the parent). On the other hand, memory used for the process by the kernel isn't accounted for, like page tables, kernel buffers, and kernel stack. In the overall picture you have to account for the memory reserved for the graphics card, the kernel's use, and assorted "holes" reserved for DOS and other prehistoric systems (that isn't much, anyway).
The only way of getting an overall picture is what the kernel reports as such. Adding up numbers with unknown overlaps and unknown omissions is a nice exercise in arithmetic, nothing more.
This problem might be caused by an incorrect sizing of the maximum size of the connection tracking table and its hash table. The Linux kernel tries to allocate contiguous pages for the connection tracking tables used by the iptables nf_conntrack module. As you don't have enough physical memory, conntrack falls back to vmalloc.
This table is not grown dynamically based on established connections but, rather, fully allocated up front based on some kernel parameters.
An additional symptom might be a large number of "nf_conntrack: falling back to vmalloc." messages in /var/log/messages (or /var/log/kern.log, or both).
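A quick way to check (which of the two logs exists depends on the distribution):
grep 'falling back to vmalloc' /var/log/messages /var/log/kern.log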
This is easily solvable by fine-tuning your connection tracking table and sizing it down. Proper sizing has to be based on actual system usage: the connection tracking table needs to be large if the system is a dedicated network firewall, but can be much smaller if you are just using iptables to protect the host itself from network intrusions.
For more information on connection tracking tuning please refer to https://wiki.khnet.info/index.php/Conntrack_tuning
To fine-tune the values for your system, first evaluate the number of connections your system keeps open by running
conntrack -L
or
/sbin/sysctl net.netfilter.nf_conntrack_count
Better yet, keep a statistic of tracked connections over time (munin does this nicely) and use the maximum number of tracked connections as a baseline. Based on this information you can configure /etc/sysctl.conf accordingly.
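As a sketch, the relevant /etc/sysctl.conf entry might look like this; the value 65536 is purely illustrative and should be derived from your observed maximum:
# example only: cap the conntrack table at an explicitly chosen size
net.netfilter.nf_conntrack_max = 65536
Apply the change without rebooting with sysctl -p.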
When fine-tuning, also review how long connections are kept in the tracking table, and check whether the conntrack entries themselves make sense. Conntrack tables sometimes fill up with rubbish because of network or firewall misconfiguration; usually those are entries for connections that were never fully established. That may happen, for example, when the server receives incoming SYN packets but its replies are always lost somewhere on the network, or when a client disconnects abruptly and leaves sockets open for a long time.
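One rough way to inspect this is to break the tracked TCP connections down by state; a pile of SYN_SENT or SYN_RECV entries suggests connections that never completed the handshake (the awk field position assumes the usual conntrack-tools output format):
conntrack -L -p tcp 2>/dev/null | awk '{print $4}' | sort | uniq -c | sort -rn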
When fine-tuning these values, running
sysctl -a | grep conntrack | grep timeout
may provide some insight. The default values are quite conservative: 600 seconds (10 minutes) for the generic timeout and 432000 seconds (5 days) for an established TCP connection. Depending on the system's purpose and network behaviour, these may need to be tuned down to reduce the number of active entries in the conntrack table, which in turn allows a smaller table.
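As an illustration, the corresponding /etc/sysctl.conf entries could look like the following; the values are hypothetical, not recommendations:
# example only: expire idle entries sooner than the defaults
net.netfilter.nf_conntrack_generic_timeout = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 86400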
Make sure, however, that you do not size the conntrack table down too far, as that can have the opposite effect: connections are dropped by iptables because they cannot be tracked, and you will start seeing messages such as this in your log files: 'kernel: ip_conntrack: table full, dropping packet.'
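A simple sanity check after resizing is to watch the current count against the new maximum and confirm it stays comfortably below it:
watch -n 5 'sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max'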
To confirm whether that is the problem, please provide the output of the following:
cat /proc/sys/net/ipv4/ip_conntrack_max
cat /proc/sys/net/ipv4/netfilter/ip_conntrack_buckets
Best Answer
On some demand-paged virtual memory systems, the operating system refuses to allocate anonymous pages (i.e. pages containing data without a filesystem source, such as runtime data, program stack, etc.) unless there is sufficient swap space to swap out the pages in order to free up physical memory. This strict accounting has the advantage that each process is guaranteed access to as much virtual memory as it allocates, but it also means that the amount of virtual memory available is essentially limited by the size of the swap space.
In practice, programs tend to allocate more memory than they use. For instance, the Java Virtual Machine allocates a lot of virtual memory on startup but does not use it immediately. Memory accounting in the Linux kernel attempts to compensate for this by tracking the amount of memory actually in use by processes, and overcommits the amount of virtual memory. In other words, the amount of virtual memory allocated by the kernel can exceed the amount of physical memory and swap space combined on the system. While this leads to better utilization of physical memory and swap space, the downside is that when the amount of memory in use exceeds the amount of physical memory and swap space available, the kernel must somehow free memory resources in order to meet the memory allocation commitment.
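The kernel's view of this accounting is visible in /proc/meminfo: Committed_AS is the total virtual memory currently committed, and CommitLimit is the ceiling (enforced only in strict accounting mode, described below):
grep -E 'CommitLimit|Committed_AS' /proc/meminfo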
The kernel mechanism that is used to reclaim memory to cover the overcommitment is called the out-of-memory killer (OOM-killer). Typically the mechanism will start killing off memory-hogging "rogue" processes to free up memory for other processes. However, if the vm.panic_on_oom sysctl setting is non-zero, the kernel will panic instead when the system runs out of memory. The possible values for the vm.panic_on_oom setting are as follows:

0 (default) When an out-of-memory situation arises, the OOM-killer kills a rogue process.
1 The kernel normally panics, but if the out-of-memory condition arises within a memory allocation restricted with mbind(MPOL_BIND) or cpusets, only the offending process is killed and no panic occurs.
2 The kernel always panics in an out-of-memory situation.

The heuristic used by the OOM-killer can be modified through the vm.oom_kill_allocating_task sysctl setting. The possible values are as follows:

0 (default) The OOM-killer scans the task list and selects a rogue task using a lot of memory to kill.
non-zero The OOM-killer kills the task that triggered the out-of-memory condition.

The kernel memory accounting algorithm can be tuned with the vm.overcommit_memory sysctl setting. The possible values are as follows:

0 (default) Heuristic overcommit with weak checks.
1 Always overcommit, no checks.
2 Strict accounting; in this mode the virtual address space limit is determined by the value of the vm.overcommit_ratio setting according to the following formula:

virtual memory limit = swap space + (physical RAM * vm.overcommit_ratio / 100)

When strict memory accounting is in use, the kernel will no longer allocate anonymous pages unless it has enough free physical memory or swap space to store them. This means it is essential that the system is configured with enough swap space.
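As a worked example with purely hypothetical numbers: on a machine with 4 GiB of RAM, 2 GiB of swap and vm.overcommit_ratio = 50, the limit works out to 2 GiB + (4 GiB * 50 / 100) = 4 GiB of committable virtual memory.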
The sysctl settings can be checked or modified at runtime with the sysctl command. To make changes permanent, the settings can be written to /etc/sysctl.conf. The above settings are also available via the /proc/sys/vm interface. The corresponding files are:

/proc/sys/vm/panic_on_oom
/proc/sys/vm/oom_kill_allocating_task
/proc/sys/vm/overcommit_memory
/proc/sys/vm/overcommit_ratio
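For instance (the value 2 here is just an example, enabling strict accounting):
sysctl vm.overcommit_memory
sysctl -w vm.overcommit_memory=2
To make this persist across reboots, add the line vm.overcommit_memory = 2 to /etc/sysctl.conf and reload with sysctl -p.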