Linux process “scheduling”

kernel, linux

I've seen it written many times that the Linux scheduler schedules processes. I'm teaching a course on multithreaded programming, and would like to get my terminology straight. Below is what I would like to say about it; I'm hoping someone can help me clear up the most egregious errors:

It's not the process that the scheduler schedules, it's the thread
associated with that process. The process is simply a bunch of memory
mapping segments, and thus static. We can see this clearly when we call
pthread_create() or even clone() (mostly, but not exactly, the
same), whereby one process has several threads, and it is those that
are scheduled (otherwise you would only schedule the process thread,
the PID=TID one, rather than any other). I assume the ambiguity is
due to the fact that all processes have at least one thread of
execution.

Is this the correct (although simplified) picture?

Best Answer

Try something like this:

All processes start with just one thread, and can create more, using pthread_create for example. (All of a process's threads created this way share the same address space.) The kernel's scheduler works on these threads, regardless of whether they're a process's "main"/initial thread or additional ones; there's essentially no difference between them from the scheduler's point of view.
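If you want something students can run, here is a minimal sketch in Python (3.8+, on Linux) rather than C. It assumes that `threading.get_native_id()` reports the kernel's thread ID; the function name `collect_thread_ids` is made up for this illustration. Each worker reports the same PID but its own TID, and those TIDs are what the scheduler actually deals with.

```python
import os
import threading


def collect_thread_ids(n_threads=3):
    """Start n_threads workers; return (pid, caller_tid, worker_ids).

    worker_ids is a list of (pid_seen_by_worker, tid_of_worker) pairs.
    """
    worker_ids = []
    lock = threading.Lock()
    # The barrier keeps every worker alive until all have started,
    # so the kernel cannot hand a finished worker's TID to a new one.
    all_started = threading.Barrier(n_threads)

    def worker():
        all_started.wait()
        with lock:
            worker_ids.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return os.getpid(), threading.get_native_id(), worker_ids


if __name__ == "__main__":
    pid, caller_tid, workers = collect_thread_ids()
    print(f"process PID={pid}, calling thread TID={caller_tid}")
    for worker_pid, worker_tid in workers:
        print(f"  worker: PID={worker_pid}, TID={worker_tid}")
```

When run as a script on Linux, the calling thread is the initial thread, so its TID should equal the PID, while every worker shows the same PID with a different TID.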

Linux initially didn't have threads at all, only processes. So the part of the OS that schedules "CPU work" is generally called the process scheduler, for historical reasons. (This isn't Linux-specific; the same is true of most (all?) Unix-type systems. "Thread scheduler" is simply not the usual vocabulary.)

I wouldn't even mention clone (let alone vfork) at that point, unless you've already explained the whole namespaces business.

Related Solutions

Linux – ionice `none: prio 0` equivalent to

1) From the docs sched-design-CFS.txt:

CFS stands for "Completely Fair Scheduler," and is the new "desktop" process scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. It is the replacement for the previous vanilla scheduler's SCHED_OTHER interactivity code.

It seems you are mixing up the O(1) scheduler with the CFQ I/O scheduler.

So on the CPU side there are the SCHED_{NORMAL, BATCH, IDLE} policies; IDLE does not have any priorities. On the I/O side there are the scheduling classes idle, best-effort and realtime.

2) Sadly, you do not show which commands you typed. For example, to change init's I/O scheduling to the best-effort class:

# ionice -p 1
none: prio 0
# ionice -c2 20 -p 1
# ionice -p 1
best-effort: prio 4

Relationship between IO scheduler and cpu/process scheduler

Let's start with the IO scheduler. There's one IO scheduler per block device. Its job is to schedule (order) the requests that pile up in the device queue. There are three different algorithms currently shipped in the Linux kernel: deadline, noop and cfq. cfq is the default, and according to its doc:

The CFQ I/O scheduler tries to distribute bandwidth equally among all processes in the system. It should provide a fair and low latency working environment, suitable for both desktop and server systems

You can configure which scheduler governs which device via the scheduler file corresponding to your block device under /sys/. (You can find it with the following command: find /sys | grep queue/scheduler.)
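As a rough sketch of reading those files programmatically, the Python below assumes the usual sysfs layout /sys/block/<device>/queue/scheduler and the usual file format, in which the active scheduler is the bracketed name (for example `noop deadline [cfq]`). The function names `parse_scheduler_line` and `list_io_schedulers` are invented for this example.

```python
import glob
import os


def parse_scheduler_line(line):
    """Parse e.g. 'noop deadline [cfq]' into (available, active).

    active is None if no name is bracketed.
    """
    available = []
    active = None
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets marking the active one
            active = token
        available.append(token)
    return available, active


def list_io_schedulers(pattern="/sys/block/*/queue/scheduler"):
    """Map each block device name to its (available, active) schedulers."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        # .../<device>/queue/scheduler -> <device>
        device = os.path.basename(os.path.dirname(os.path.dirname(path)))
        try:
            with open(path) as f:
                result[device] = parse_scheduler_line(f.read())
        except OSError:
            continue  # device went away or file is unreadable; skip it
    return result


if __name__ == "__main__":
    for device, (available, active) in list_io_schedulers().items():
        print(f"{device}: active={active}, available={available}")
```

Changing the scheduler is done by writing one of the available names to the same file as root; this sketch only reads.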

What that short description doesn't say is that cfq is the only one of these schedulers that looks at the ioprio of a process. ioprio is a setting that you can assign to a process, and the algorithm takes it into account when deciding which request to serve before another. ioprio can be set via the ionice utility.

Then, there's the task scheduler. Its job is to allocate the CPUs amongst the processes that are ready to run. It takes into account things like the priority, the class and the niceness of a given process, as well as how long that process has run and other heuristics.
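To make "class" and "niceness" a little more concrete, here is a small Python sketch that asks the kernel for the CPU scheduling policy and nice value of the caller. It relies on `os.sched_getscheduler` and `os.getpriority`, which are available on Linux; the helper name `describe_cpu_scheduling` and the name table are assumptions made for this illustration.

```python
import os

# Build a value -> name table from whichever policy constants this
# platform's os module exposes (SCHED_BATCH and SCHED_IDLE are Linux-only).
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_BATCH", "SCHED_IDLE",
                 "SCHED_FIFO", "SCHED_RR")
    if hasattr(os, name)
}


def describe_cpu_scheduling(pid=0):
    """Return (policy_name, nice) for pid; pid 0 means the caller."""
    policy = os.sched_getscheduler(pid)
    nice = os.getpriority(os.PRIO_PROCESS, pid)
    # Fall back to the raw number for values not in the table,
    # e.g. a policy combined with extra flag bits.
    return POLICY_NAMES.get(policy, f"unknown({policy})"), nice


if __name__ == "__main__":
    name, nice = describe_cpu_scheduling()
    print(f"policy={name}, nice={nice}")
```

SCHED_OTHER is the userspace name for what the kernel calls SCHED_NORMAL. None of this touches ioprio, which is a separate setting handled by the IO scheduler.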

Now, to your questions:

What is relationship between IO scheduler and CPU scheduler?

Not much, besides the name. They schedule different shared resources. The first one orders the requests going to the disks, and the second one schedules the 'requests' (you can view a process as requesting CPU time to be able to run) to the CPU.

CPU scheduling happens first. IO scheduler is a thread itself and subject to CPU scheduling.

It doesn't happen like that. The IO scheduler algorithm is run by whichever process is queuing a request. A good way to see this is to look at crashes that have elv_add_request() in their path. For example:

 [...]
 [<c027fac4>] error_code+0x74/0x7c
 [<c019ed65>] elv_next_request+0x6b/0x116
 [<e08335db>] scsi_request_fn+0x5e/0x26d [scsi_mod]
 [<c019ee6a>] elv_insert+0x5a/0x134
 [<c019efc1>] __elv_add_request+0x7d/0x82
 [<c019f0ab>] elv_add_request+0x16/0x1d
 [<e0e8d2ed>] pkt_generic_packet+0x107/0x133 [pktcdvd]
 [<e0e8d772>] pkt_get_disc_info+0x42/0x7b [pktcdvd]
 [<e0e8eae3>] pkt_open+0xbf/0xc56 [pktcdvd]
 [<c0168078>] do_open+0x7e/0x246
 [<c01683df>] blkdev_open+0x28/0x51
 [<c014a057>] __dentry_open+0xb5/0x160
 [<c014a183>] nameidata_to_filp+0x27/0x37
 [<c014a1c6>] do_filp_open+0x33/0x3b
 [<c014a211>] do_sys_open+0x43/0xc7
 [<c014a2cd>] sys_open+0x1c/0x1e
 [<c0102b82>] sysenter_past_esp+0x5f/0x85

Notice how the process enters the kernel calling open(), and this ends up involving the elevator (elv) algorithm.
