Cgroups – How to Limit All Processes Except Whitelist to a Single CPU

cgroups, cpu usage, resources

Red Hat has a guide to cgroups that is somewhat helpful, but it doesn't answer this question.

I know how to restrict a specific process to a specific CPU at the time that process is started:

First, putting the following* in /etc/cgconfig.conf:

mount {
  cpuset =  /cgroup/cpuset;
  cpu =     /cgroup/cpu;
  cpuacct = /cgroup/cpuacct;
  memory =  /cgroup/memory;
  devices = /cgroup/devices;
  freezer = /cgroup/freezer;
  net_cls = /cgroup/net_cls;
  blkio =   /cgroup/blkio;
}

group cpu0only {
  cpuset {
    cpuset.cpus = 0;
    cpuset.mems = 0;
  }
}

And then start a process and assign it specifically to that cgroup by using:

cgexec -g cpuset:cpu0only myprocessname

I can limit all instances of a specific process name automatically by (I think this is correct) putting the following in /etc/cgrules.conf:

# user:process  controller  destination
*:myprocessname cpuset      cpu0only

My question is: How can I do the reverse?

In other words, how can I assign all processes except for a specific set of whitelisted processes and their children to a restricted cgroup?

Based on what I have studied, but haven't tested, I believe that a partial solution would be:

Add an "unrestricted" cgroup:

group anycpu {
  cpuset {
    cpuset.cpus = 0-31;
    # Not sure about this param but it seems to be required
    cpuset.mems = 0;
  }
}

Assign my process explicitly to the unrestricted group, and everything else to the restricted group:

# user:process  controller  destination
*:myprocessname cpuset      anycpu
*               cpuset      cpu0only

However, the caveat on this seems to be (from reading the docs, not from testing, so grain of salt) that the children of myprocessname will be reassigned to the restricted cpu0only cgroup.

A possible alternative approach would be to create a user to run myprocessname and have all of that user's processes unrestricted, and everything else restricted. However, in my actual use case, the process needs to be run by root, and there are other processes that also must be run by root which should be restricted.

How can I accomplish this with cgroups?

If this is not possible with cgroups (which I now suspect is the case), are my ideas of partial solutions correct and will they work as I think?

*Disclaimer: This is probably not a minimal code example; I don't understand all the parts, so I don't know which are unnecessary.

Best Answer

UPDATE: Note that the answer below applies to RHEL 6. In RHEL 7, most cgroups are managed by systemd, and libcgroup is deprecated.

Since posting this question I have studied the entire guide that I linked to above, as well as the majority of the cgroups.txt documentation and cpusets.txt. I now know more than I ever expected to learn about cgroups, so I'll answer my own question here.

There are multiple approaches you can take. Our company's contact at Red Hat (a Technical Architect) recommended against a blanket restriction of all processes and in favor of a more declarative approach: restricting only the processes we specifically wanted restricted. The reason he gave is that system calls can depend on user-space code (such as LVM processes), and restricting that code could slow the system down, which is the opposite of the intended effect. So I ended up restricting several specifically named processes and leaving everything else alone.

Additionally, I want to mention some cgroup basic data that I was missing when I posted my question.

Cgroups do not depend on libcgroup being installed. However, libcgroup is a set of tools for automatically handling cgroup configuration and process assignments to cgroups, and it can be very helpful.

I found that the libcgroup tools can also be misleading, because the libcgroup package is built on its own set of abstractions and assumptions about your use of cgroups, which are slightly different than the actual kernel level implementation of cgroups. (I can put examples but it would take some work; comment if you're interested.)

Therefore, before using the libcgroup tools and files (such as /etc/cgconfig.conf, /etc/cgrules.conf, cgexec, cgcreate, cgclassify, etc.), I highly recommend getting very familiar with the /cgroup virtual filesystem itself. Practice manually creating cgroups and cgroup hierarchies (including hierarchies with multiple subsystems attached, which libcgroup sneakily and leakily abstracts away), reassigning processes to different cgroups by running echo $the_pid > /cgroup/some_cgroup_hierarchy/a_cgroup_within_that_hierarchy/tasks, and performing the other seemingly magical tasks that libcgroup handles under the hood.
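As a rough sketch of that manual approach (assuming a cgroup v1 cpuset hierarchy mounted at /cgroup/cpuset as in my question; the function names and the CPUSET_ROOT variable are made up for illustration), creating a cgroup and moving a process into it comes down to a mkdir and a few writes:

```shell
#!/bin/sh
# Hypothetical helpers. CPUSET_ROOT is an illustrative variable, not a libcgroup setting.
CPUSET_ROOT="${CPUSET_ROOT:-/cgroup/cpuset}"

# make_cpuset NAME CPUS MEMS: create a child cgroup and set its CPUs and memory nodes.
make_cpuset() {
    mkdir -p "$CPUSET_ROOT/$1" || return 1
    # Both values must be set before the kernel accepts tasks into a cpuset.
    echo "$2" > "$CPUSET_ROOT/$1/cpuset.cpus" || return 1
    echo "$3" > "$CPUSET_ROOT/$1/cpuset.mems"
}

# move_pid NAME PID: write one PID to the cgroup's tasks file.
move_pid() {
    echo "$2" > "$CPUSET_ROOT/$1/tasks"
}
```

Run as root, make_cpuset cpu0only 0 0 followed by move_pid cpu0only $$ would confine the current shell to CPU 0. Nothing here survives a reboot, which is part of what cgconfig.conf is for.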

Another basic concept I was missing was that if the /cgroup virtual filesystem is mounted on your system at all (or more accurately, if any of the cgroup subsystems aka "controllers" are mounted at all), then every process on your entire system is in a cgroup. There is no such thing as "some processes are in a cgroup and some aren't".
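A quick way to see this for yourself is /proc/PID/cgroup, which has one line per mounted hierarchy in the form id:controllers:path. The helper below is only a sketch (the name cgroup_of is made up) that prints the cgroup path of a process for one controller:

```shell
#!/bin/sh
# cgroup_of CONTROLLER [FILE]: print the cgroup path for CONTROLLER.
# FILE defaults to /proc/self/cgroup (read by the awk child, which shares the
# caller's cgroups); pass /proc/PID/cgroup to inspect another process.
cgroup_of() {
    awk -F: -v want="$1" '
        {
            # Field 2 may list several co-mounted controllers, e.g. "cpu,cpuacct".
            n = split($2, names, ",")
            for (i = 1; i <= n; i++) {
                if (names[i] == want) {
                    # The path is everything after the second colon.
                    print substr($0, length($1) + length($2) + 3)
                    exit
                }
            }
        }
    ' "${2:-/proc/self/cgroup}"
}
```

A result of / means the process sits in the root cgroup of that hierarchy. An empty result means the controller is not mounted at all.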

There is what is called the root cgroup for a given hierarchy, which owns all the system's resources for the attached subsystems. For example a cgroup hierarchy that has the cpuset and blkio subsystems attached, would have a root cgroup which would own all the cpus on the system and all the blkio on the system, and could share some of those resources with child cgroups. You can't restrict the root cgroup because it owns all your system's resources, so restricting it wouldn't even make sense.

Some other simple data I was missing about libcgroup:

If you use /etc/cgconfig.conf, you should ensure that chkconfig --list cgconfig shows that cgconfig is set to run at system boot.

If you change /etc/cgconfig.conf, you need to run service cgconfig restart to load in the changes. (And problems with stopping the service or running cgclear are very common when fooling around testing. For debugging I recommend, for example, lsof /cgroup/cpuset, if cpuset is the name of the cgroup hierarchy you are using.)

If you want to use /etc/cgrules.conf, you need to ensure the "cgroup rules engine daemon" (cgrulesengd) is running: service cgred start and chkconfig cgred on. (And you should be aware of a possible but unlikely race condition regarding this service, as described in the Red Hat Resource Management Guide in section 2.8.1 at the bottom of the page.)

If you want to fool around manually and set up your cgroups using the virtual filesystem (which I recommend for first use), you can do so and then create a cgconfig.conf file to mirror your setup by using cgsnapshot with its various options.
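The service-related steps above can be collected into one small script. This is a sketch for RHEL 6 style init only; the run wrapper, the DRY_RUN variable and the function name are my own additions, so that the commands can be printed and reviewed before anything is changed:

```shell
#!/bin/sh
# run CMD...: execute the command, or just print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Illustrative function name; the four commands are the ones described above.
enable_cgroup_services() {
    run chkconfig cgconfig on       # parse /etc/cgconfig.conf at boot
    run service cgconfig restart    # reload after editing /etc/cgconfig.conf
    run chkconfig cgred on          # start cgrulesengd at boot
    run service cgred start         # apply /etc/cgrules.conf from now on
}
```

With DRY_RUN=1 set, calling enable_cgroup_services only prints the four commands. Without it, the function needs to be run as root.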

And finally, the key piece of info I was missing when I wrote the following:

However, the caveat on this seems to be...that the children of myprocessname will be reassigned to the restricted cpu0only cgroup.

I was correct, but there is an option I was unaware of.

cgexec is the command to start a process/run a command and assign it to a cgroup.

cgclassify is the command to assign an already running process to a cgroup.

Both of these will also prevent cgred (cgrulesengd) from reassigning the specified process to a different cgroup based on /etc/cgrules.conf.

Both cgexec and cgclassify support the --sticky flag, which additionally prevents cgred from reassigning child processes based on /etc/cgrules.conf.

So, the answer to the question as I wrote it (though not the setup I ended up implementing, because of the advice from our Red Hat Technical Architect mentioned above) is:

Make the cpu0only and anycpu cgroup as described in my question. (Ensure cgconfig is set to run at boot.)

Make the * cpuset cpu0only rule as described in my question. (And ensure cgred is set to run at boot.)

Start any processes I want unrestricted with: cgexec -g cpuset:anycpu --sticky myprocessname.

Those processes will be unrestricted, and all their child processes will be unrestricted as well. Everything else on the system will be restricted to CPU 0 (once you reboot, since cgred doesn't apply cgrules to already running processes unless they change their EUID). This is not completely advisable, but that was what I initially requested and it can be done with cgroups.
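Putting the three steps together, here is a sketch that writes the group definitions and the catch-all rule into a staging directory for review instead of touching /etc directly. The function name and its arguments are illustrative, and the mount section from my question is assumed to already be in your real cgconfig.conf:

```shell
#!/bin/sh
# stage_whitelist_config DIR LAST_CPU
# Writes cgconfig.conf and cgrules.conf fragments into DIR. Nothing is installed.
stage_whitelist_config() {
    dir=$1 last_cpu=$2
    mkdir -p "$dir" || return 1

    # Group definitions from the question. Merge these into /etc/cgconfig.conf.
    cat > "$dir/cgconfig.conf" <<EOF
group cpu0only {
  cpuset {
    cpuset.cpus = 0;
    cpuset.mems = 0;
  }
}

group anycpu {
  cpuset {
    cpuset.cpus = 0-$last_cpu;
    cpuset.mems = 0;
  }
}
EOF

    # Catch-all rule: cgred places every process it classifies in cpu0only.
    cat > "$dir/cgrules.conf" <<EOF
# user:process  controller  destination
*               cpuset      cpu0only
EOF
}
```

For a 32-CPU machine that would be stage_whitelist_config /tmp/cgstage 31. Once the files are merged into /etc and the services restarted, whitelisted programs are started with cgexec -g cpuset:anycpu --sticky myprocessname, and grep Cpus_allowed_list /proc/PID/status shows which CPUs a given process may actually use.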

Related Solutions

Limit CPU usage % all processes and cores

After a few days of intensive research I found two methods of lowering the CPU usage of processes. Generally, if you want to lower the CPU usage of the entire machine, there are probably a few programs that keep the CPU busy for the longest stretches, and you should restrict those rather than burden the entire machine. And if you are doing this to save battery life, you might also want to control the hardware's power usage with e.g. tuned or powertop; most distros have tools to help you with that.

Stop/Continue signals

Signals have been around since the early days of UNIX. To pause a process (freeing the CPU and possibly lowering the temperature), send it SIGSTOP or SIGTSTP. SIGSTOP cannot be caught or ignored, so the process gets no chance to do cleanup work before it stops, which can upset some programs. SIGTSTP can be caught or ignored, so a process may clean up first or may not stop at all. Use the one that suits your process. To resume the process and let it take the CPU again, send it SIGCONT. Alternating the two produces a series of "spikes" on the CPU graph and keeps the processor from overheating, because the pauses never give it enough sustained load to heat up.

A consequence of this method is that the pauses are not smooth, so video playback and even web browsing won't be smooth either. You may therefore want to limit this method to simple shell commands; multi-process programs such as Google Chrome or make won't work well with it. Obviously, it's not recommended to pause/resume system processes like systemd.

Although you could do this manually, cpulimit is a nice small program that uses this method (it uses SIGSTOP/SIGCONT). Contrary to the description, the cpu % you specify is between 0 and 100 even if you have multiple cores. And you can always suspend a job in the terminal with Ctrl-Z.
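Done by hand, the idea looks roughly like the sketch below. This is not how cpulimit is implemented, just the bare stop/continue loop; the function name and its arguments are made up for illustration:

```shell
#!/bin/sh
# throttle PID RUN PAUSE CYCLES: alternate SIGSTOP and SIGCONT on one process.
# RUN and PAUSE are in seconds.
throttle() {
    pid=$1 run=$2 pause=$3 cycles=$4
    i=0
    # kill -0 sends no signal; it only tests that the process still exists.
    while [ "$i" -lt "$cycles" ] && kill -0 "$pid" 2>/dev/null; do
        kill -STOP "$pid" || break
        sleep "$pause"
        kill -CONT "$pid" || break
        sleep "$run"
        i=$((i + 1))
    done
    # Never leave the target stopped when we return.
    kill -CONT "$pid" 2>/dev/null
    return 0
}
```

For example, throttle 1234 1 1 30 would let PID 1234 run for about half of the next minute. It only signals that one PID, so child processes keep running at full speed.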

cpupower (highly recommended)

This one ships with the Linux kernel sources, so most distros should provide it (get it here if you don't have it). It is a command-line utility that manages the CPU frequency, so it effectively controls the entire CPU using governors (e.g., performance, powersave); it can also do other things. Unlike the pause/resume method, processes run much more smoothly with this one. You'll need to set the maximum frequency for the processor.

Run cpupower frequency-info to see your available processor states.
As root, run cpupower frequency-set -u <frequency>. Start with the lowest frequency you have, then work up to the highest one that doesn't overheat.
Optionally, install the lm_sensors package, which lets you see your system temperature. Run sensors-detect and answer 'yes' to all the questions, then run sensors to see the current and critical (beyond which the system overheats) temperatures.

At this point the temperature should be much lower. Be aware, though, that some performance-intensive programs like games might appear to hang after you run the above command. If you get a popup window saying so, wait for the program rather than force-quitting it. Note that you have to run this command every time the system reboots (unless you get it to run automatically). See this and this for more information on cpupower.
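To find the starting point for the second step without reading the frequency-info output by eye, something like the following sketch may help. It assumes the cpufreq driver exposes scaling_available_frequencies in sysfs, which not every driver does (intel_pstate, for example, does not), and both function names are made up. It only prints the command and does not run it:

```shell
#!/bin/sh
# lowest_khz: read whitespace-separated frequencies (kHz) on stdin, print the smallest.
lowest_khz() {
    tr -s ' \t' '\n\n' | grep -E '^[0-9]+$' | sort -n | head -n 1
}

# cap_command: print a cpupower command that caps the maximum frequency at the
# lowest available step. Prints nothing if the sysfs file is missing.
cap_command() {
    f=/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
    [ -r "$f" ] || return 0
    khz=$(lowest_khz < "$f")
    # cpupower reads a bare number as kHz, the same unit sysfs uses.
    [ -n "$khz" ] && echo "cpupower frequency-set -u $khz"
    return 0
}
```

From there you can raise the value step by step, as described above, until you find the highest frequency that stays cool.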

What’s “broken” about cpuset cgroup inheritance semantics in the Linux kernel

I'm not nearly well-versed enough with cgroups to give a definitive answer (and I certainly don't have experience with cgroups going back to 2013!) but on a vanilla Ubuntu 16.04, cgroups v1 seems to have its act together:

I devised a small test that forces forking as a different user using a child sudo /bin/bash spun off with & - the -H flag is extra paranoia to force sudo to execute with root's home environment.

cat <(whoami) /proc/self/cgroup >me.cgroup && \
sudo -H /bin/bash -c 'cat <(whoami) /proc/self/cgroup >you.cgroup' & \
sleep 2 && diff me.cgroup you.cgroup

This yields:

1c1
< admlocal
---
> root

For reference, this is the structure of cgroup mounts on my system:

$ mount | grep group
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
$
