10 minutes is very much long-term as far as Linux's scheduler is concerned. Time slices are something like 10ms.
When you're looking at CPU usage percentages, keep in mind that top adds up the per-thread usage of multi-threaded processes. So a 10-thread process that has each thread getting 10% active time will show up as using 100% of a CPU.
Linux's scheduler won't starve a nice 19 task (because deadlock bugs are hard to avoid if a process can be descheduled forever), so even nice 19 won't stop a task from getting some CPU time. If it has a lot of threads, it may still use significant CPU resources.
If some of the processes are blocking on I/O, especially virtual memory paging, their CPU usage % will go way down. Run something like dstat to see CPU usage breakdowns, disk, network, paging, and context switches. It's like vmstat but colourized and nicer.
Make sure your processes really are niced the way you think they are, by looking at the NI column in top. (It's unlikely that different threads in the same process will have different nice levels, but I think it's possible.)
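If you want to check individual threads rather than trust the per-process NI column, here is a rough sketch that reads the nice value out of /proc. It assumes a Linux /proc layout where the nice value is field 19 of each stat file; the function names are mine, not from any tool.

```python
import os

def nice_from_stat(stat_line):
    """Extract the nice value (field 19) from a /proc/<pid>/task/<tid>/stat line."""
    # The command name (field 2) is parenthesised and may itself contain
    # spaces or parentheses, so split on the last ')' before counting fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 19 (nice) is rest[16].
    return int(rest[16])

def thread_nice_levels(pid):
    """Map each thread ID of `pid` to its nice value (Linux only)."""
    levels = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        with open(f"{task_dir}/{tid}/stat") as f:
            levels[int(tid)] = nice_from_stat(f.read())
    return levels
```

Calling thread_nice_levels(os.getpid()) on a Linux box should give one entry per thread; if the values differ, the threads really are niced differently.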
If you've been using renice, remember that it's not recursive. renice-ing a parent process won't affect existing children, only future children.
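Since renice only touches the process you name, one way around it is to walk the process tree yourself. The following is only a sketch: it assumes Linux's /proc/<pid>/stat layout (parent PID in field 4), and descendants, read_parent_map and renice_tree are hypothetical helper names.

```python
import os

def descendants(root, parent_of):
    """Return all PIDs below `root`, given a pid -> parent pid mapping."""
    children = {}
    for pid, ppid in parent_of.items():
        children.setdefault(ppid, []).append(pid)
    found, stack = [], [root]
    while stack:
        for child in children.get(stack.pop(), []):
            found.append(child)
            stack.append(child)
    return found

def read_parent_map():
    """Build the pid -> ppid mapping by scanning /proc (Linux only)."""
    parent_of = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                # After the closing ')' come state (field 3) and ppid (field 4).
                parent_of[int(entry)] = int(f.read().rsplit(")", 1)[1].split()[1])
        except (FileNotFoundError, ProcessLookupError):
            pass  # the process exited while we were scanning
    return parent_of

def renice_tree(root, nice):
    """Set the nice value of `root` and of every descendant that exists right now."""
    for pid in [root] + descendants(root, read_parent_map()):
        try:
            os.setpriority(os.PRIO_PROCESS, pid, nice)
        except ProcessLookupError:
            pass  # already gone
```

Children forked after the call inherit the new value anyway, so a single pass is usually enough. Raising priority (a lower nice value) still needs the usual privileges.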
CPUScheduling{Policy|Priority}
The link tells you that CPUSchedulingPriority should only be set for fifo or rr ("real-time") tasks. You do not want to force real-time scheduling on services.
CPUSchedulingPolicy=other is the default.
That leaves batch and idle. The difference between them is only relevant if you have multiple idle-priority tasks consuming CPU at the same time. In theory batch gives higher throughput (in exchange for longer latencies). But it's not a big win, so it's not really relevant in this case.
idle literally starves if anything else wants the CPU. CPU priority is rather less significant than it was on old single-core UNIX systems. I would be happier starting with nice, e.g. nice level 10 or 14, before resorting to idle. See next section.
However, most desktops are relatively idle most of the time. And when you do have a CPU hog that would pre-empt the background task, it's common for the hog to use only one of your CPUs. With that in mind, I would not consider it too risky to use idle in the context of an average desktop or laptop, unless it has an Atom / Celeron / ARM CPU rated at or below about 15 watts; then I would want to look at things a bit more carefully.
Is nice level 'subverted' by the kernel 'autogroup' feature?
Yeah.
Autogrouping is a little weird. The author of systemd didn't like the heuristic, even for desktops. If you want to test disabling autogrouping, you can set the sysctl kernel.sched_autogroup_enabled to 0. I guess it's best to test by setting the sysctl in permanent configuration and rebooting, to make sure you get rid of all the autogroups.
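To confirm what the running kernel thinks after the reboot, you can read the sysctl back through /proc. A minimal sketch, assuming the usual /proc/sys mapping of kernel.sched_autogroup_enabled; autogroup_enabled is just an illustrative name.

```python
from pathlib import Path

# Assumed location of the sysctl under /proc/sys.
AUTOGROUP_SYSCTL = Path("/proc/sys/kernel/sched_autogroup_enabled")

def autogroup_enabled(path=AUTOGROUP_SYSCTL):
    """Return True or False for the sysctl value, or None if the file is
    absent (for example on a kernel built without autogroup support)."""
    try:
        return path.read_text().strip() != "0"
    except FileNotFoundError:
        return None

if __name__ == "__main__":
    print("autogroup enabled:", autogroup_enabled())
```

If this prints False, nice levels are compared directly between tasks rather than within each session's autogroup.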
Then you should be able to set nice levels for your services without any problem. At least in current versions of systemd - see next section.
E.g. nice level 10 will reduce the weight each thread has in the Linux CPU scheduler, to about 10%. Nice level 14 is under 5%. (Link: full formula)
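Those percentages can be checked with a few lines, using the approximation that each nice level divides the weight by about 1.25, starting from 1024 at nice 0. This is a sketch of the approximation only; as far as I know the kernel itself uses a table of rounded values, so the real numbers differ slightly.

```python
def approx_weight(nice):
    """Approximate scheduler weight: 1024 at nice 0, about 1.25x less per level."""
    return 1024 / (1.25 ** nice)

def weight_relative_to_nice0(nice):
    """Weight of a thread at `nice` as a fraction of a nice-0 thread's weight."""
    return approx_weight(nice) / approx_weight(0)

if __name__ == "__main__":
    for level in (0, 5, 10, 14, 19):
        share = weight_relative_to_nice0(level)
        print(f"nice {level:2d}: {share:6.1%} of a nice-0 thread's weight")
```

Keep in mind this is weight relative to one competing nice-0 thread, not a cap. On an otherwise idle CPU, a niced thread still gets all of it.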
Appendix: is nice level 'subverted' by systemd cgroups?
The current DefaultCPUAccounting= setting defaults to off, unless it can be enabled without also enabling CPU control on a per-service basis. So it should be fine. You can check this in your current documentation: man systemd-system.conf
Be aware that per-service CPU control will also be enabled when any service sets CPUAccounting / CPUWeight / StartupCPUWeight / CPUShares / StartupCPUShares.
The following blog extract is out of date (but still online). The default behaviour has since changed, and the reference documentation has been updated accordingly.
As a nice default, if the cpu controller is enabled in the kernel, systemd will create a cgroup for each service when starting it. Without any further configuration this already has one nice effect: on a systemd system every system service will get an even amount of CPU, regardless how many processes it consists off. Or in other words: on your web server MySQL will get the roughly same amount of CPU as Apache, even if the latter consists a 1000 CGI script processes, but the former only of a few worker tasks. (This behavior can be turned off, see DefaultControllers= in /etc/systemd/system.conf.)
On top of this default, it is possible to explicitly configure the CPU shares a service gets with the CPUShares= setting. The default value is 1024, if you increase this number you'll assign more CPU to a service than an unaltered one at 1024, if you decrease it, less.
http://0pointer.de/blog/projects/resources.html
Best Answer
The proportion of the processor time a particular process receives is determined by the relative difference in niceness between it and other runnable processes.
The Linux Completely Fair Scheduler (CFS) calculates a weight based on the niceness. The weight is roughly equivalent to 1024 / (1.25 ^ nice_value). As the nice value decreases the weight increases exponentially. The timeslice allocated for the process is proportional to the weight of the process divided by the total weight of all runnable processes. The implementation of the CFS is in kernel/sched/fair.c.
The CFS has a target latency for the scheduling duration. Smaller target latencies yield better interactivity, but as the target latency decreases, the switching overhead increases, thus decreasing the overall throughput.
Given for instance a target latency of 20 milliseconds and two runnable processes of equal niceness, both processes will run for 10 milliseconds each before being pre-empted in favour of the other process. If there are 10 processes of equal niceness, each runs for 2 milliseconds.
Now consider two processes, one with a niceness of 0 (the default), the other with a niceness of 5. The ratio of the corresponding weights is roughly 1:3, meaning that the higher priority process receives a timeslice of approximately 15 milliseconds while the lower priority process receives a timeslice of approximately 5 milliseconds.
Lastly consider two processes with the niceness values of 5 and 10 respectively. While the absolute niceness is larger in this case, the relative difference between the niceness values is the same as in the previous example, yielding an identical timeslice division.
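The three examples above can be reproduced with a short calculation. This is a simplified model, not the kernel's code: it uses the approximate weight formula and the 20 millisecond target latency assumed in the examples, and timeslices_ms is an illustrative name.

```python
def timeslices_ms(nice_values, target_latency_ms=20.0):
    """Split the target latency between runnable processes in proportion
    to their approximate weights, 1024 / (1.25 ** nice)."""
    weights = [1024 / (1.25 ** n) for n in nice_values]
    total = sum(weights)
    return [target_latency_ms * w / total for w in weights]

if __name__ == "__main__":
    for case in ([0, 0], [0] * 10, [0, 5], [5, 10]):
        slices = ", ".join(f"{t:.2f}" for t in timeslices_ms(case))
        print(f"nice {case}: {slices} ms")
```

Only the difference between nice values enters the result, which is why the 0 versus 5 and 5 versus 10 cases come out the same.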