We ran into an interesting bug today. On our servers we put users into cgroup folders to monitor and control their use of resources such as CPU and memory. We started getting errors when trying to add user-specific memory cgroup folders:
mkdir /sys/fs/cgroup/memory/users/newuser
mkdir: cannot create directory ‘/sys/fs/cgroup/memory/users/newuser’: Cannot allocate memory
That seemed a little strange, because the machine actually had a reasonable amount of free memory and swap. Changing the sysctl value for vm.overcommit_memory from 0 to 1 had no effect.
We did notice that we were running with quite a lot of user-specific subfolders (about 7,000 in fact), and most of them were for users that were no longer running processes on that machine.
ls /sys/fs/cgroup/memory/users/ | wc -l
7298
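As a cross-check on the directory count, the kernel also reports how many cgroups each controller currently tracks in /proc/cgroups (third column, num_cgroups). A minimal sketch, assuming a cgroup v1 system; count_cgroups is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a copy of the file:

```shell
#!/bin/sh
# Sketch: print the num_cgroups column for one controller from /proc/cgroups.
# count_cgroups is a hypothetical helper, not a standard tool.
count_cgroups() {
    subsys=$1
    file=${2:-/proc/cgroups}   # columns: subsys_name hierarchy num_cgroups enabled
    awk -v s="$subsys" '$1 == s { print $3 }' "$file"
}
```

Running count_cgroups memory and count_cgroups cpu side by side shows whether the two controllers are tracking a similar number of groups.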
Deleting unused folders in the cgroup hierarchy actually fixed the problem:
cd /sys/fs/cgroup/memory/users/
ls | xargs -n1 rmdir
# errors for folders in-use, succeeds for unused
mkdir /sys/fs/cgroup/memory/users/newuser
# now works fine
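The same cleanup can be done without relying on rmdir errors to skip busy groups. This is only a sketch: prune_empty_cgroups is a hypothetical helper name, it assumes the cgroup v1 layout where each group lists its member PIDs in cgroup.procs, and it only looks at direct children of the directory it is given.

```shell
#!/bin/sh
# Sketch: remove child cgroup directories that have no attached processes.
# prune_empty_cgroups is a hypothetical helper; pass the parent directory.
prune_empty_cgroups() {
    root=$1
    removed=0
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue        # glob matched nothing
        dir=${dir%/}
        # cgroupfs files report size 0, so read the content instead of using -s
        if [ -n "$(cat "$dir/cgroup.procs" 2>/dev/null)" ]; then
            continue                     # still has processes, leave it alone
        fi
        # rmdir still fails harmlessly if the group has child groups
        if rmdir "$dir" 2>/dev/null; then
            removed=$((removed + 1))
        fi
    done
    echo "removed $removed"
}
```

For the case above it would be called as prune_empty_cgroups /sys/fs/cgroup/memory/users, and it prints how many groups it removed.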
Interestingly, the problem only affected the memory cgroup. The cpu/accounting cgroup was fine, even though it actually had more users in the hierarchy:
ls /sys/fs/cgroup/cpu,cpuacct/users/ | wc -l
7450
mkdir /sys/fs/cgroup/cpu,cpuacct/users/newuser
# fine
So, what was causing these out-of-memory errors? Does the memory-cgroup subsystem itself have some sort of memory limit of its own?
The contents of the cgroup mounts may be found here.
Best Answer
There are indeed limits per cgroup; you can read about them on LWN.net:
The maximum amount of memory is stored in /sys/fs/cgroup/memory/memory.limit_in_bytes. If the problem you experienced was really caused by the cgroup memory limit, then /sys/fs/cgroup/memory/memory.max_usage_in_bytes should be close to that value. You can also check memory.failcnt, which records the number of times your actual usage hit the limit.
You may also want to check memory.kmem.failcnt and memory.kmem.tcp.failcnt for the equivalent counters on kernel memory and TCP buffer memory.
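To look at all of these counters in one go, something like the following could be used. It is a rough sketch, not a definitive diagnostic: show_memcg_counters is a hypothetical helper name, and the file names are the cgroup v1 ones mentioned above, some of which may be missing depending on kernel configuration.

```shell
#!/bin/sh
# Sketch: print the limit, peak usage and failure counters of one memory cgroup.
# show_memcg_counters is a hypothetical helper; pass the cgroup directory.
show_memcg_counters() {
    cg=$1
    for f in memory.limit_in_bytes memory.max_usage_in_bytes memory.failcnt \
             memory.kmem.failcnt memory.kmem.tcp.failcnt; do
        if [ -r "$cg/$f" ]; then
            printf '%s %s\n' "$f" "$(cat "$cg/$f")"
        else
            printf '%s (not available)\n' "$f"
        fi
    done
}
```

For example, show_memcg_counters /sys/fs/cgroup/memory/users prints one line per counter. A failcnt of 0 next to a max usage far below the limit would suggest the configured limit is not what is being hit.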