Linux – Using and understanding systemd scheduling-related options in a desktop context

Tags: ionice, linux, nice, scheduling, systemd

In systemd service files, one can set the following scheduling related options (from the systemd.exec man page, correct me if I'm wrong):

Nice
Sets the default nice level (scheduling priority) for executed processes. Takes an integer between -20 (highest priority) and 19 (lowest priority). See setpriority(2) for details.

Which is the familiar nice level. It seems its effect is somewhat 'subverted' by the 'autogroup' feature of recent Linux kernels. So the options below may be what I really want to set to keep processes behaving nicely for my desktop experience.

CPUSchedulingPolicy
Sets the CPU scheduling policy for executed processes. Takes one of other, batch, idle, fifo or rr. See sched_setscheduler(2) for details.

CPUSchedulingPriority
Sets the CPU scheduling priority for executed processes. The available priority range depends on the selected CPU scheduling policy (see above). For real-time scheduling policies an integer between 1 (lowest priority) and 99 (highest priority) can be used. See sched_setscheduler(2) for details.

CPUSchedulingResetOnFork
Takes a boolean argument. If true, elevated CPU scheduling priorities and policies will be reset when the executed processes fork, and can hence not leak into child processes. See sched_setscheduler(2) for details. Defaults to false.

I understand the last option. I gather from the explanation of the first two that I can choose a scheduling policy and then, given that policy, a priority. It is not entirely clear to me what I should choose for which kinds of tasks. For example, is it safe to choose 'idle' for backup tasks (relatively CPU-intensive, because of deduplication), or is another policy better suited?
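To make this concrete, here is the kind of drop-in I have in mind (backup.service is a made-up unit name, and the value is just a guess on my part, not a recommendation):

    # Hypothetical example: apply the 'idle' CPU policy to a made-up backup.service.
    mkdir -p /etc/systemd/system/backup.service.d
    cat > /etc/systemd/system/backup.service.d/cpu-scheduling.conf <<'EOF'
    [Service]
    CPUSchedulingPolicy=idle
    EOF
    systemctl daemon-reload
    systemctl restart backup.service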

In general, what I am looking for is an understandable overview of each policy, its priority range, and its suitability for specific purposes. The interaction with the nice level is also of interest.

Besides CPU scheduling, there is I/O scheduling. I guess this corresponds to ionice (correct me if I'm wrong).

IOSchedulingClass
Sets the I/O scheduling class for executed processes. Takes an integer between 0 and 3 or one of the strings none, realtime, best-effort or idle. See ioprio_set(2) for details.

IOSchedulingPriority
Sets the I/O scheduling priority for executed processes. Takes an integer between 0 (highest priority) and 7 (lowest priority). The available priorities depend on the selected I/O scheduling class (see above). See ioprio_set(2) for details.

Here we see the same structure as with CPU scheduling, and I am looking for the same kind of information.
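If my ionice guess is right, a drop-in like the following (again with the hypothetical backup.service) should roughly match running the job under ionice -c 2 -n 7, i.e. the lowest best-effort priority:

    # Hypothetical example: I/O scheduling drop-in for the made-up backup.service.
    mkdir -p /etc/systemd/system/backup.service.d
    cat > /etc/systemd/system/backup.service.d/io-scheduling.conf <<'EOF'
    [Service]
    IOSchedulingClass=best-effort
    IOSchedulingPriority=7
    EOF
    # Roughly equivalent to: ionice -c 2 -n 7 <command>
    # (and IOSchedulingClass=idle would correspond to: ionice -c 3 <command>)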

For all the 'Scheduling' options, the man pages referred to are not clear enough for me, mostly when it comes to translating the details into a somewhat technically-inclined desktop user's point of view.

Best Answer

CPUScheduling{Policy|Priority}

The man page tells you that CPUSchedulingPriority should only be set for fifo or rr ("real-time") tasks. You do not want to force real-time scheduling on services.

CPUSchedulingPolicy=other is the default.

That leaves batch and idle. The difference between them is only relevant if you have multiple idle-priority tasks consuming CPU at the same time. In theory batch gives higher throughput (in exchange for longer latencies). But it's not a big win, so it's not really relevant in this case.
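If you want to see the (small) difference for yourself, chrt(1) can start a command under either policy. A sketch, using an arbitrary CPU-bound command as the test load (run several copies to create contention; Ctrl-C to stop):

    # Start the same CPU-bound job under each policy and compare.
    # The priority argument must be 0 for the batch and idle policies.
    chrt --batch 0 md5sum /dev/urandom   # SCHED_BATCH: hinted as CPU-bound, still respects nice
    chrt --idle  0 md5sum /dev/urandom   # SCHED_IDLE: only runs when a CPU is otherwise idle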

idle literally starves if anything else wants the CPU. CPU priority is rather less significant now than it used to be on old single-core UNIX systems. I would be happier starting with nice, e.g. nice level 10 or 14, before resorting to idle. See the next section.
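A minimal sketch of that 'nice first' approach, again assuming the hypothetical backup.service:

    # Hypothetical: de-prioritize a service with plain nice instead of SCHED_IDLE.
    mkdir -p /etc/systemd/system/backup.service.d
    cat > /etc/systemd/system/backup.service.d/nice.conf <<'EOF'
    [Service]
    Nice=14
    EOF
    systemctl daemon-reload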

However, most desktops are relatively idle most of the time. And when you do have a CPU hog that would pre-empt the background task, it is common for the hog to use only one of your CPUs. With that in mind, I would not consider it too risky to use idle in the context of an average desktop or laptop. Unless it has an Atom / Celeron / ARM CPU rated at or below about 15 watts; in that case I would want to look at things a bit more carefully.

Is nice level 'subverted' by the kernel 'autogroup' feature?

Yeah.

Autogrouping is a little weird. The author of systemd didn't like the heuristic, even for desktops. If you want to test disabling autogrouping, you can set the sysctl kernel.sched_autogroup_enabled to 0. I guess it's best to test by setting the sysctl in permanent configuration and rebooting, to make sure you get rid of all the autogroups.
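To check and then disable it, something along these lines should work (the file name under /etc/sysctl.d is my own choice):

    # Check whether autogrouping is currently enabled (1 = on, 0 = off):
    sysctl kernel.sched_autogroup_enabled
    # Disable it for the current boot only:
    sysctl -w kernel.sched_autogroup_enabled=0
    # Disable it permanently, so no autogroups exist after a reboot:
    echo 'kernel.sched_autogroup_enabled = 0' > /etc/sysctl.d/99-disable-autogroup.conf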

Then you should be able to set nice levels for your services without any problem. At least in current versions of systemd - see the next section.

E.g. nice level 10 will reduce the weight each thread has in the Linux CPU scheduler to about 10% of the default. Nice level 14 is under 5%. (Link: full formula)
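The rule of thumb behind those numbers: each nice step divides a task's scheduler weight by roughly 1.25, so the relative weight is about 1/1.25^nice. A quick way to reproduce the figures:

    # Relative scheduler weight vs. nice 0, using the ~1.25x-per-step rule:
    awk 'BEGIN { for (n = 0; n <= 19; n += 2) printf "nice %2d -> %5.1f%%\n", n, 100 / 1.25 ^ n }'
    # nice 10 -> ~10.7%, nice 14 -> ~4.4%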

Appendix: is nice level 'subverted' by systemd cgroups?

The current DefaultCPUAccounting= setting defaults to off, unless it can be enabled without also enabling CPU control on a per-service basis. So it should be fine. You can check this in your current documentation: man systemd-system.conf

Be aware that per-service CPU control will also be enabled when any service sets CPUAccounting / CPUWeight / StartupCPUWeight / CPUShares / StartupCPUShares.
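A quick way to check, assuming a reasonably recent systemctl (the unit name in the second command is an arbitrary example):

    # Manager-wide default:
    systemctl show --property=DefaultCPUAccounting
    # Effective value for one service:
    systemctl show --property=CPUAccounting cron.service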

The following blog extract is out of date (but still online). The default behaviour has since changed, and the reference documentation has been updated accordingly.

As a nice default, if the cpu controller is enabled in the kernel, systemd will create a cgroup for each service when starting it. Without any further configuration this already has one nice effect: on a systemd system every system service will get an even amount of CPU, regardless of how many processes it consists of. Or in other words: on your web server MySQL will get roughly the same amount of CPU as Apache, even if the latter consists of 1000 CGI script processes but the former only of a few worker tasks. (This behavior can be turned off, see DefaultControllers= in /etc/systemd/system.conf.)

On top of this default, it is possible to explicitly configure the CPU shares a service gets with the CPUShares= setting. The default value is 1024; if you increase this number, you'll assign more CPU to a service than to an unaltered one at 1024, and if you decrease it, less.

http://0pointer.de/blog/projects/resources.html
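Since that extract is dated: CPUShares= has since been deprecated in favour of CPUWeight= (whose default is 100, not 1024). A hedged sketch of the modern equivalent of doubling a service's share, with a made-up unit name:

    # Hypothetical drop-in: give a service twice the default CPU weight.
    # CPUWeight= replaces the older CPUShares=; the default weight is 100.
    mkdir -p /etc/systemd/system/myapp.service.d
    cat > /etc/systemd/system/myapp.service.d/cpu-weight.conf <<'EOF'
    [Service]
    CPUWeight=200
    EOF
    systemctl daemon-reload
    # Note: setting CPUWeight= enables per-service CPU control, as warned above.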