Centos – cgroups memory limit – write error: Device or resource busy

centos cgroups memory swap

I'm running CentOS 7 with kernel 3.10.0-693.5.2.el7.x86_64.
I use cgroups to apply a memory limit to processes. During a rolling restart of the application, the memory limit is doubled to accommodate the extra memory needed.

However, sometimes after the restart it is not possible to lower the memory+swap limit back to its original value, and the write to the cgroup file fails with "write error: Device or resource busy".

For example:

[root@us app]# echo "643825664" > memory.limit_in_bytes
[root@us app]# echo "673825664" > memory.memsw.limit_in_bytes
-bash: echo: write error: Device or resource busy
[root@us app]# echo "873825664" > memory.memsw.limit_in_bytes
[root@us app]#

Writing a bigger value (such as +200MB) seems to work ok.

I haven't figured out why this happens. I didn't find anything in the cgroup documentation that refers to this error. I assume it has something to do with the current swap usage being higher than the limit.

Do you have any experience with such errors?

Best Answer

What does cat memory.memsw.usage_in_bytes say? You cannot set the limit below the current usage.
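To make that check repeatable, here is a minimal sketch of a pre-flight test in POSIX shell. The helper name memsw_limit_fits and the directory layout are assumptions for illustration; only the file name memory.memsw.usage_in_bytes comes from the cgroup v1 memory controller.

```shell
# Hypothetical helper: would the kernel accept this memory+swap limit?
# $1 = cgroup v1 memory directory, $2 = requested limit in bytes.
# The ( ... ) body runs in a subshell, so "usage" does not leak to the caller.
memsw_limit_fits() (
    # Current memory+swap usage of the cgroup, in bytes.
    usage=$(cat "$1/memory.memsw.usage_in_bytes") || exit 2
    if [ "$usage" -le "$2" ]; then
        echo "ok: usage $usage <= requested $2"
        exit 0
    fi
    echo "too low: usage $usage > requested $2"
    exit 1
)
```

It could be called as memsw_limit_fits /sys/fs/cgroup/memory/app 673825664, where the path is only a guess at where the cgroup is mounted. Usage can change between this check and the actual write, so a passing check does not guarantee that the write succeeds.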

Looking at the 3.10 Linux sources, modifying memsw.limit_in_bytes results in a call to mem_cgroup_write():

{
    .name = "memsw.limit_in_bytes",
    .private = MEMFILE_PRIVATE(_MEMSWAP, RES_LIMIT),
    .write_string = mem_cgroup_write,
    .read = mem_cgroup_read,
},

mem_cgroup_write() is defined at:
https://elixir.bootlin.com/linux/v3.10/source/mm/memcontrol.c#L5199

mem_cgroup_write() in turn calls mem_cgroup_resize_memsw_limit() when the type is _MEMSWAP:

else if (type == _MEMSWAP)
    ret = mem_cgroup_resize_memsw_limit(memcg, val);

mem_cgroup_resize_memsw_limit() is defined at:
https://elixir.bootlin.com/linux/v3.10/source/mm/memcontrol.c#L4647

That function calls res_counter_set_limit():
https://elixir.bootlin.com/linux/v3.10/source/include/linux/res_counter.h#L200

That function's implementation is:

unsigned long flags;
int ret = -EBUSY;

spin_lock_irqsave(&cnt->lock, flags);
if (cnt->usage <= limit) {
    cnt->limit = limit;
    ret = 0;
}
spin_unlock_irqrestore(&cnt->lock, flags);
return ret;

Note that ret is initialized to -EBUSY (which corresponds to the "Device or resource busy" message that you're seeing), and is changed to zero only if the current usage is less than or equal to the requested limit. My guess is that in your case it's not, so the function returns -EBUSY.
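The same logic can be replayed in user space with the numbers from the question. This is only a model of the check, not kernel code: the usage figure and the starting limit are made up, the variable names are placeholders for cnt->usage and cnt->limit, and the locking is left out.

```shell
# User-space model of the check in res_counter_set_limit().
EBUSY=16                     # errno value of EBUSY on Linux
counter_usage=700000000      # hypothetical current memory+swap usage
counter_limit=1287651328     # hypothetical limit in force after the doubling

set_limit() {
    if [ "$counter_usage" -le "$1" ]; then
        counter_limit=$1     # accepted: usage fits under the new limit
        return 0
    fi
    return "$EBUSY"          # rejected: the old limit stays in place
}

for requested in 673825664 873825664; do
    if set_limit "$requested"; then
        echo "$requested: accepted"
    else
        echo "$requested: rejected with status $?"
    fi
done
```

With the assumed usage of 700000000 bytes, the first request is rejected with status 16 and the second is accepted, which matches the pattern in the question: the smaller value fails and the larger one goes through.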

If res_counter_set_limit() returns a non-zero value to mem_cgroup_resize_memsw_limit(), that function in turn returns the same value to mem_cgroup_write(). From there the return value is propagated to user space, which is why echo reports the error that you're seeing.
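Because the failure reaches user space as a failed write, a script can detect it through the exit status of echo and try again later. The sketch below assumes that usage eventually drops (for example once the old process has exited); the helper name lower_memsw_limit, the attempt count and the delay are all hypothetical choices.

```shell
# Hypothetical helper: retry lowering memory.memsw.limit_in_bytes.
# $1 = cgroup v1 memory directory, $2 = limit in bytes,
# $3 = number of attempts (default 5), $4 = seconds between attempts (default 1).
# The ( ... ) body runs in a subshell, so the variables stay local.
lower_memsw_limit() (
    dir=$1 limit=$2 attempts=${3:-5} delay=${4:-1}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # The write returns non-zero for as long as the kernel rejects the limit.
        if echo "$limit" 2>/dev/null > "$dir/memory.memsw.limit_in_bytes"; then
            exit 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "giving up: $limit still rejected after $attempts attempts" >&2
    exit 1
)
```

A call such as lower_memsw_limit /sys/fs/cgroup/memory/app 673825664 10 2 would try ten times, two seconds apart; the path is again an assumption. The helper does not distinguish EBUSY from other write failures, so a permanent error such as a missing file also ends in "giving up".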

The implementation in current kernel sources is a bit different, but the behavior is the same: you cannot lower the limit to a value below the current usage.
