Linux – What’s the policy determining which CPU handles which interrupt in the Linux Kernel

interrupt kernel linux-kernel

I've been reading Linux Kernel Development and there's something that's not entirely clear to me — when an interrupt is triggered by the hardware, what's the criterion to decide on which CPU to run the interrupt handling logic?

I could imagine it always being the CPU that issued the I/O request, but since the requesting thread is by then asleep for all practical purposes, there would not be much point in that.

On the other hand, there may be timer interrupts (for the scheduler, for instance) that need to be raised. On an SMP system, are they always raised on the same core (say, #0), or can they be raised on any core?

How does it actually work?

Thanks

Best Answer

On a multiprocessor/multicore system, you might find a daemon process named irqbalance. Its job is to adjust the distribution of hardware interrupts across processors.

At boot time, when the firmware hands control of the system over to the kernel, just one CPU core is running. This first core (usually core #0, sometimes called the "monarch CPU/core") takes over all interrupt handling responsibilities from the firmware before initializing the system and starting up the other CPU cores. So if nothing is done to distribute the load, the core that started the system ends up with all the interrupt handling duties.
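To check how interrupts are actually spread across cores on a given machine, the per-CPU counters in /proc/interrupts can be summed. The snippet below is only a rough sketch: the function name per_cpu_totals and the SAMPLE text are made up for illustration, and it assumes the usual layout of a header row of CPU names followed by one "name: counts..." row per interrupt source.

```python
from pathlib import Path
from typing import Dict

# Hypothetical excerpt in the usual /proc/interrupts layout (illustration only).
SAMPLE = """\
           CPU0       CPU1
  0:         46          0   IO-APIC   2-edge      timer
 24:       1000       3000   PCI-MSI 512000-edge   eth0
ERR:          0
"""


def per_cpu_totals(text: str) -> Dict[str, int]:
    """Sum the per-CPU interrupt counters found in /proc/interrupts-style text."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        _, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()[: len(cpus)]
        # Skip summary rows (such as ERR) that do not carry one counter per CPU.
        if len(fields) < len(cpus) or not all(f.isdigit() for f in fields):
            continue
        for cpu, count in zip(cpus, fields):
            totals[cpu] += int(count)
    return totals


if __name__ == "__main__":
    source = Path("/proc/interrupts")
    text = source.read_text() if source.exists() else SAMPLE
    print(per_cpu_totals(text))
```

If one core's total dwarfs the others, that core is doing most of the interrupt handling. For the sample text above, the result is {'CPU0': 1046, 'CPU1': 3000}.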

https://www.kernel.org/doc/Documentation/IRQ-affinity.txt suggests that on modern kernels, all CPU cores are allowed to handle IRQs equally by default. But this might not be the optimal solution, as it may lead to e.g. inefficient use of CPU cache lines with frequent IRQ sources. It is the job of irqbalance to fix that.
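The default described in that document is exposed as a hexadecimal CPU bitmask in /proc/irq/default_smp_affinity (bit 0 is CPU 0, bit 1 is CPU 1, and so on; on large systems the mask is printed as comma-separated 32-bit groups). Here is a small sketch for decoding such a mask. The helper name parse_affinity_mask is my own, not something the kernel or irqbalance provides.

```python
from pathlib import Path
from typing import List


def parse_affinity_mask(mask: str) -> List[int]:
    """Decode a hex CPU bitmask such as 'f' or '00000001,00000000' into CPU indices."""
    # Commas only separate 32-bit groups (most significant first), so drop them.
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


if __name__ == "__main__":
    path = Path("/proc/irq/default_smp_affinity")
    try:
        print("default IRQ affinity:", parse_affinity_mask(path.read_text()))
    except OSError:
        # Not on Linux, or /proc is unavailable: fall back to a demo value.
        print("demo mask 'f' ->", parse_affinity_mask("f"))
```

On a 4-core machine with the default untouched, the file would typically contain f, which decodes to CPUs 0 through 3.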

irqbalance is not a kernel process: it's a standalone binary, /usr/sbin/irqbalance, that can run either in one-shot mode (i.e. adjust the distribution of interrupts once as part of the boot process, and exit) or as a daemon. Different Linux distributions can elect to use it differently, or to omit it altogether. Keeping this logic in userspace makes it easy to test and implement arbitrarily complex strategies for assigning IRQs to processors, simply by updating the binary.

It works by writing to the per-IRQ /proc/irq/%i/smp_affinity files, each of which holds a hexadecimal bitmask of the CPUs that are allowed to handle that IRQ. If you're interested in the details, check the source code of irqbalance: the actual assignment of IRQ settings happens in activate.c.
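For illustration, here is roughly what pinning an IRQ by hand through that same interface could look like. This is a sketch, not what irqbalance itself does: the function names and the proc_root parameter are invented for this example, the write needs root, the kernel may refuse it for IRQs that cannot be moved, and a running irqbalance daemon may later overwrite the setting.

```python
from pathlib import Path
from typing import Iterable


def cpus_to_mask(cpus: Iterable[int]) -> str:
    """Encode CPU indices as a hex bitmask in comma-separated 32-bit groups."""
    value = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU indices must be non-negative")
        value |= 1 << cpu
    if value == 0:
        raise ValueError("at least one CPU is required")
    digits = format(value, "x")
    width = -(-len(digits) // 8) * 8  # round up to a multiple of 8 hex digits
    padded = digits.zfill(width)
    return ",".join(padded[i:i + 8] for i in range(0, width, 8))


def set_irq_affinity(irq: int, cpus: Iterable[int], proc_root: str = "/proc") -> None:
    """Write the mask to <proc_root>/irq/<irq>/smp_affinity (requires root on a real system)."""
    # proc_root is a hypothetical parameter so the function can be tried against a scratch directory.
    Path(proc_root, "irq", str(irq), "smp_affinity").write_text(cpus_to_mask(cpus))


if __name__ == "__main__":
    # Only print the masks here; call set_irq_affinity() yourself, as root, to apply one.
    print(cpus_to_mask([0, 2]))  # CPUs 0 and 2
    print(cpus_to_mask([32]))    # a CPU beyond the first 32-bit group
```

For example, set_irq_affinity(24, [0, 2]) would write 00000005 to /proc/irq/24/smp_affinity, restricting the (hypothetical) IRQ 24 to CPUs 0 and 2.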