Linux Kernel – What Does an Idle CPU Process Do?

cpu, linux-kernel

Looking at the source of strace I found the use of the clone flag CLONE_IDLETASK which is described there as:

#define CLONE_IDLETASK 0x00001000 /* kernel-only flag */

After looking deeper into it, I found that, although that flag is not covered in man clone, it is actually used by the kernel during the boot process to create idle processes (all of which should have PID 0) for each CPU on the machine. That is, a machine with 8 CPUs will have at least 7 (see question below) such processes "running" (note the quotes).

Now, this leads me to a couple of questions about what that "idle" process actually does. My assumption is that it executes NOP operations continuously until its timeframe ends and the kernel either assigns a real process to run or assigns the idle process once again (if the CPU is not being used). Yet that's a complete guess. So:

  1. On a machine with, say, 8 CPUs, will 7 such idle processes be created? (And will one CPU be held by the kernel itself whilst not performing userspace work?)

  2. Is the idle process really just an infinite stream of NOP operations (or a loop that does the same)?

  3. Is CPU usage (say uptime) simply calculated by how long the idle process was on the CPU and how long it was not there during a certain period of time?


P.S. It is likely that a good deal of this question is due to the fact that I do not fully understand how a CPU works. i.e. I understand the assembly, the timeframes and the interrupts but I do not know how, for example, a CPU may use more or less energy depending on what it is executing. I would be grateful if someone can enlighten me on that too.

Best Answer

The idle task is used for process accounting, and also to reduce energy consumption. In Linux, one idle task is created for every processor, and locked to that processor; whenever there’s no other process to run on that CPU, the idle task is scheduled. Time spent in the idle tasks appears as “idle” time in tools such as top. (Uptime is calculated differently.)
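To make the accounting side concrete, here is a minimal sketch of how a tool could derive an "idle" percentage from the per-CPU counters in /proc/stat, assuming the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal, ...). The function names parse_cpu_line, idle_fraction and measure are made up for this example, and this is not how top itself is implemented; it only illustrates the idea of comparing idle ticks against total ticks over an interval.

```python
import time


def parse_cpu_line(line):
    """Return (idle_ticks, total_ticks) for one 'cpu...' line of /proc/stat."""
    fields = line.split()
    # fields[0] is the label ("cpu", "cpu0", ...). The counters that follow are
    # user, nice, system, idle, iowait, irq, softirq, steal (then guest fields,
    # which are skipped here because guest time is already included in user).
    ticks = [int(x) for x in fields[1:9]]
    idle = ticks[3]
    return idle, sum(ticks)


def idle_fraction(before, after):
    """Fraction of the elapsed ticks between two samples that were idle."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0  # no ticks elapsed between the samples
    return (idle1 - idle0) / elapsed


def measure(interval=1.0, path="/proc/stat"):
    """Sample the file twice and return {cpu_label: idle_fraction}. Linux only."""
    def snapshot():
        with open(path) as f:
            return {l.split()[0]: l for l in f if l.startswith("cpu")}

    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    return {name: idle_fraction(before[name], after[name])
            for name in before if name in after}
```

On a Linux machine, calling measure() returns one entry per CPU plus an aggregate "cpu" entry; a mostly idle core should report a value close to 1.0.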

Unix seems to always have had an idle loop of some sort (but not necessarily an actual idle task, see Gilles’ answer), and even in V1 it used a WAIT instruction which stopped the processor until an interrupt occurred (it stood for “wait for interrupt”). Some other operating systems used busy loops instead, DOS, OS/2, and early versions of Windows in particular. For quite a long time now, CPUs have used this kind of “wait” instruction to reduce their energy consumption and heat production. You can see various implementations of idle tasks in, for example, arch/x86/kernel/process.c in the Linux kernel. The basic one just calls HLT, which stops the processor until an interrupt occurs (and enables the C1 energy-saving mode). The other implementations handle various bugs or inefficiencies (e.g. using MWAIT instead of HLT on some CPUs).
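The difference between a busy loop and a halting idle loop can be shown with a toy model. This is only a simulation in Python under simplifying assumptions (time is counted in abstract "cycles", interrupts arrive at known cycle numbers, and handling one costs a single cycle); the names busy_idle and halting_idle are invented for this sketch and nothing here corresponds to actual kernel code.

```python
def busy_idle(interrupt_times, horizon):
    """Busy loop: execute one no-op every cycle and check for work each time."""
    pending = set(interrupt_times)
    executed = 0
    handled = []
    for cycle in range(horizon):
        executed += 1              # the CPU does (useless) work on every cycle
        if cycle in pending:
            handled.append(cycle)
    return executed, handled


def halting_idle(interrupt_times, horizon):
    """Halting loop: stay stopped until the next interrupt, then wake briefly."""
    executed = 0
    handled = []
    for t in sorted(set(interrupt_times)):
        if t >= horizon:
            break
        executed += 1              # the CPU only runs when an interrupt wakes it
        handled.append(t)
    return executed, handled
```

Both variants react to the same interrupts, but in this model the busy loop executes a cycle for every tick of the horizon while the halting loop executes only one per interrupt. That gap is, loosely, where the energy saving of HLT-style instructions comes from.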

All this is completely separate from idle states in processes, when they’re waiting for an event (I/O etc.).
