To quote Robert Love:
> The scheduler does not magically know whether a process is interactive. It requires some heuristic that is capable of accurately reflecting whether a task is I/O-bound or processor-bound. The most indicative metric is how long the task sleeps. If a task spends most of its time asleep, it is I/O-bound. If a task spends more time runnable than sleeping, it is not interactive. This extends to the extreme; a task that spends nearly all the time sleeping is completely I/O-bound, whereas a task that spends nearly all its time runnable is completely processor-bound.
>
> To implement this heuristic, Linux keeps a running tab on how much time a process spends sleeping versus how much time the process spends in a runnable state. This value is stored in the `sleep_avg` member of the `task_struct`. It ranges from zero to `MAX_SLEEP_AVG`, which defaults to 10 milliseconds. When a task becomes runnable after sleeping, `sleep_avg` is incremented by how long it slept, until the value reaches `MAX_SLEEP_AVG`. For every timer tick the task runs, `sleep_avg` is decremented until it reaches zero.
So, I believe the kernel decides the scheduling behaviour based on the above heuristic. As far as I know, for real-time processes the scheduling policy can be either `SCHED_FIFO` or `SCHED_RR`. The two policies are similar, except that `SCHED_RR` has a time slice while `SCHED_FIFO` does not.
However, we can change the scheduling of a real-time process as well. You can refer to this question on how to change the scheduling of a real-time process.
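As a sketch, the policy of a running process can be inspected (and, with sufficient privileges, changed) using `chrt` from util-linux; the PID below is hypothetical, and switching to a real-time policy normally requires root or an `rtprio` limit:

```shell
# Show the scheduling policy and priority of the current shell
chrt -p $$

# Hypothetical example: move PID 1234 to SCHED_RR with priority 50
# (requires root, CAP_SYS_NICE, or an rtprio entry in limits.conf)
# chrt -r -p 50 1234
```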
References
http://www.informit.com/articles/article.aspx?p=101760&seqNum=2
`CPUScheduling{Policy|Priority}`

The link tells you that `CPUSchedulingPriority` should only be set for `fifo` or `rr` ("real-time") tasks. You do not want to force real-time scheduling on services. `CPUSchedulingPolicy=other` is the default.
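For illustration, a drop-in making that default explicit might look like this (the unit name and path are hypothetical):

```ini
# /etc/systemd/system/example.service.d/sched.conf (hypothetical path)
[Service]
# "other" is the default policy; fifo/rr would make the service
# real-time, which is what we want to avoid here.
CPUSchedulingPolicy=other
```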
That leaves `batch` and `idle`. The difference between them is only relevant if you have multiple idle-priority tasks consuming CPU at the same time. In theory `batch` gives higher throughput (in exchange for longer latencies), but it's not a big win, so it's not really relevant in this case.
`idle` literally starves if anything else wants the CPU. CPU priority is rather less significant than it used to be on old UNIX systems with a single core. I would be happier starting with `nice`, e.g. nice level 10 or 14, before resorting to `idle`. See next section.
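A minimal sketch of the `nice` approach (the long-running command is a placeholder):

```shell
# Start a child at nice level 10; `nice` with no arguments prints
# the niceness the child process inherits.
nice -n 10 nice

# Hypothetical long-running background job at nice 14:
# nice -n 14 some_long_job
```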
However, most desktops are relatively idle most of the time. And when you do have a CPU hog that would pre-empt the background task, it's common for the hog to use only one of your CPUs. With that in mind, I would not feel it was too risky to use `idle` in the context of an average desktop or laptop. Unless it has an Atom / Celeron / ARM CPU rated at or below about 15 watts; then I would want to look at things a bit more carefully.
Is nice level 'subverted' by the kernel 'autogroup' feature?
Yeah.
Autogrouping is a little weird. The author of `systemd` didn't like the heuristic, even for desktops. If you want to test disabling autogrouping, you can set the sysctl `kernel.sched_autogroup_enabled` to `0`. I guess it's best to test by setting the sysctl in permanent configuration and rebooting, to make sure you get rid of all the autogroups.
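The permanent configuration could look like this (the file name is just an example):

```ini
# /etc/sysctl.d/50-disable-autogroup.conf (example name)
# Disable kernel autogrouping; apply with `sysctl --system` or reboot.
kernel.sched_autogroup_enabled = 0
```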
Then you should be able to set nice levels for your services without any problem, at least in current versions of systemd - see next section.

E.g. nice level 10 will reduce the weight each thread has in the Linux CPU scheduler to about 10%. Nice level 14 is under 5%. (Link: full formula)
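The approximate rule is that each nice step multiplies a task's CFS weight by about 1/1.25 (the kernel actually uses a precomputed table close to this). A quick check of the percentages quoted above:

```shell
# weight(n) ~ 1024 / 1.25^n ; share shown relative to a nice-0 task
awk 'BEGIN {
  for (n = 0; n <= 14; n += 2)
    printf "nice %2d -> weight %4d (%5.1f%% of nice 0)\n",
           n, 1024 / 1.25^n, 100 / 1.25^n
}'
```

This gives roughly 10.7% at nice 10 and 4.4% at nice 14, matching the figures above.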
Appendix: is nice level 'subverted' by systemd cgroups?
The current `DefaultCPUAccounting=` setting defaults to off, unless it can be enabled without also enabling CPU control on a per-service basis. So it should be fine. You can check this in your current documentation: `man systemd-system.conf`.

Be aware that per-service CPU control will also be enabled when any service sets `CPUAccounting` / `CPUWeight` / `StartupCPUWeight` / `CPUShares` / `StartupCPUShares`.
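If you do want explicit per-service CPU control, a drop-in along these lines (unit name hypothetical) would enable it:

```ini
# /etc/systemd/system/example.service.d/cpu.conf (hypothetical path)
[Service]
# Setting any of these enables per-service CPU control:
CPUAccounting=yes
# Default weight is 100; lower means less CPU under contention.
CPUWeight=50
```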
The following blog extract is out of date (but still online). The default behaviour has since changed, and the reference documentation has been updated accordingly.
> As a nice default, if the cpu controller is enabled in the kernel, systemd will create a cgroup for each service when starting it. Without any further configuration this already has one nice effect: on a systemd system every system service will get an even amount of CPU, regardless how many processes it consists off. Or in other words: on your web server MySQL will get the roughly same amount of CPU as Apache, even if the latter consists a 1000 CGI script processes, but the former only of a few worker tasks. (This behavior can be turned off, see DefaultControllers= in /etc/systemd/system.conf.)
>
> On top of this default, it is possible to explicitly configure the CPU shares a service gets with the CPUShares= setting. The default value is 1024, if you increase this number you'll assign more CPU to a service than an unaltered one at 1024, if you decrease it, less.
http://0pointer.de/blog/projects/resources.html
Best Answer
Question 1
It is possible for a user to use real-time priority for a process as well. This can be configured in the `/etc/security/limits.conf` file. If we check the item section of that file, we find the `rtprio` item ("max realtime priority"), which enables setting a real-time priority for users.
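For illustration, a hypothetical entry granting members of a `realtime` group permission to use real-time priorities up to 99 might look like:

```
# /etc/security/limits.conf -- rtprio is the "max realtime priority" item
@realtime    -    rtprio    99
```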
Question 2 and Question 3
The scheduling policy of a process can be set to `SCHED_FIFO` or `SCHED_RR` with the `chrt` command. So, to answer question 3, we should first verify the scheduling algorithms available and their priority ranges using the `chrt -m` command, and then use whichever scheduling algorithm suits our need, passing the desired priority to `chrt`.
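A sketch of the corresponding `chrt` invocations (the PID and priority values are only examples; setting a real-time policy normally requires root):

```shell
# List the available policies and their valid priority ranges
chrt -m

# Set SCHED_FIFO with priority 50 on an existing (hypothetical) PID:
# chrt -f -p 50 1234

# Set SCHED_RR with priority 30 on the same PID:
# chrt -r -p 30 1234

# Or launch a new command directly under SCHED_RR:
# chrt -r 30 some_command
```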