Linux CPU – How Pre-emptive Scheduling Is Accomplished

cpu linux process-management

I am reading up on Linux processes from The Linux Documentation Project: https://www.tldp.org/LDP/tlk/kernel/processes.html

Processes are always making system calls and so may often need to wait. Even so, if a process executes until it waits then it still might use a disproportionate amount of CPU time and so Linux uses pre-emptive scheduling. In this scheme, each process is allowed to run for a small amount of time, 200ms, and, when this time has expired another process is selected to run and the original process is made to wait for a little while until it can run again. This small amount of time is known as a time-slice.

My question is, how is this time being kept track of? If the process is currently the only one occupying the CPU, then there is nothing actually checking if the time has expired, right?

I understand that processes jump to syscalls and those jump back to the scheduler, so it makes sense how processes can be “swapped” in that regard. But how is Linux capable of keeping track of how much time a process has had on the CPU? Is it only possible via hardware timers?

Best Answer

The short answer is yes. All practical approaches to preemption use some sort of CPU interrupt to jump back into privileged mode, i.e. into the Linux kernel scheduler.
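To make the idea concrete, here is a toy model of tick-driven preemption. It is only a sketch, not kernel code: the function name simulate, the 10 ms tick period and the round-robin queue are assumptions chosen for illustration, and the 200 ms slice is the figure quoted in the question. No task in the model ever makes a system call; the simulated timer tick is the only thing that hands control back to the scheduler.

```python
from collections import deque

TICK_MS = 10      # assumed timer interrupt period (illustrative value)
SLICE_MS = 200    # time-slice length quoted in the question


def simulate(work_ms, tick_ms=TICK_MS, slice_ms=SLICE_MS):
    """Return the (task, ran_ms) stretches in the order they ran.

    work_ms maps a task name to the CPU time it needs in milliseconds.
    Tasks never yield on their own; only the timer tick can stop them.
    """
    ready = deque(work_ms.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        used = 0
        # Each iteration stands for one timer interrupt: the "kernel"
        # charges the elapsed tick to the running task and checks the slice.
        while remaining > 0 and used < slice_ms:
            step = min(tick_ms, remaining)
            remaining -= step
            used += step
        timeline.append((name, used))
        if remaining > 0:
            # Slice expired with work left: preempt and requeue at the back.
            ready.append((name, remaining))
    return timeline


if __name__ == "__main__":
    for name, ran in simulate({"A": 450, "B": 100}):
        print(f"{name} ran for {ran} ms")
```

Task A needs more than one slice, so it is interrupted after 200 ms even though it never asked to stop, which is exactly the situation the question is about.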

If you look at your /proc/interrupts you'll find the interrupts used in the system, including timers.
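If you would rather pull the timer rows out programmatically, something like the following should work. It is a rough sketch: the helper name timer_interrupts is made up for this example, and it assumes the usual layout of /proc/interrupts (a header row of CPU names, then one row per interrupt with a label, per-CPU counts and a description). Row labels and descriptions differ between architectures and kernel versions.

```python
def timer_interrupts(text):
    """Return {label: total_count} for rows whose description mentions 'timer'."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or "timer" not in rest.lower():
            continue
        fields = rest.split()
        # The first ncpus fields are the per-CPU interrupt counts.
        counts = [int(f) for f in fields[:ncpus] if f.isdigit()]
        totals[label.strip()] = sum(counts)
    return totals


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as f:
            for label, total in timer_interrupts(f.read()).items():
                print(f"{label}: {total}")
    except OSError as exc:
        print(f"could not read /proc/interrupts: {exc}")
```

Running it twice a second apart shows the counts growing, even on an otherwise idle machine.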

Note that Linux has several different schedulers, and the classic periodic-timer style is seldom used. From the Completely Fair Scheduler (CFS) documentation:

CFS uses nanosecond granularity accounting and does not rely on any jiffies or other HZ detail. Thus the CFS scheduler has no notion of “timeslices” in the way the previous scheduler had, and has no heuristics whatsoever.
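As a very rough illustration of scheduling without fixed timeslices: the same CFS documentation describes picking the task with the smallest accumulated virtual runtime. The toy below models only that one rule. The name pick_order, the equal task weights and the per-turn run times are assumptions for the example; the real scheduler uses a red-black tree, nice-level weights and much more.

```python
import heapq


def pick_order(run_ns_per_turn, turns):
    """Return the order in which tasks get picked over `turns` decisions.

    run_ns_per_turn maps a task name to the nanoseconds it runs each time
    it is picked (before it blocks or is preempted). All tasks are assumed
    to have equal weight, so virtual runtime equals real runtime here.
    """
    # Min-heap keyed on accumulated runtime in nanoseconds.
    heap = [(0, name) for name in sorted(run_ns_per_turn)]
    heapq.heapify(heap)
    order = []
    for _ in range(turns):
        vruntime, name = heapq.heappop(heap)  # smallest runtime goes next
        order.append(name)
        heapq.heappush(heap, (vruntime + run_ns_per_turn[name], name))
    return order


if __name__ == "__main__":
    tasks = {"cpu_hog": 3_000_000, "interactive": 1_000_000}
    print(pick_order(tasks, 6))
```

A task that only runs briefly each time gets picked more often, so both end up with similar total CPU time without anyone counting fixed slices.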

Also, when a program issues a system call (usually via a software interrupt, or "trap"), the kernel can preempt the calling program as well. This is especially evident with system calls that wait for data from other processes.
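You can watch both kinds of switch from user space. The sketch below reads the voluntary and involuntary context-switch counters that getrusage reports for the current process (ru_nvcsw and ru_nivcsw in Python's resource module). The helper names context_switches, measure, spin and nap are made up for this example, and the numbers depend entirely on how busy the machine is; on an idle multi-core box the busy loop may see few or no involuntary switches.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure(fn):
    """Run fn() and return how many switches of each kind happened meanwhile."""
    v0, i0 = context_switches()
    fn()
    v1, i1 = context_switches()
    return v1 - v0, i1 - i0


def spin(seconds=0.2):
    """Burn CPU without blocking; any switch away from us is a preemption."""
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        pass


def nap(times=20):
    """Block repeatedly in a system call, giving up the CPU each time."""
    for _ in range(times):
        time.sleep(0.001)


if __name__ == "__main__":
    print("busy loop (voluntary, involuntary):", measure(spin))
    print("sleeping  (voluntary, involuntary):", measure(nap))
```

Typically the sleeping run shows mostly voluntary switches, while any switches during the busy loop are involuntary ones triggered by an interrupt.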