Linux Kernel – How a Keyboard Press is Processed

driversinterruptkeyboardlinux-kernel

I'm currently learning about the Linux Kernel and OSes in general, and while I have found many great resources concerning IRQs, Drivers, Scheduling and other important OS concepts, as well as keyboard-related resources, I am having a difficult time putting together a comprehensive overview of how the Linux Kernel handles a button press on a keyboard. I'm not trying to understand every single detail at this stage, but am rather trying to connect concepts, somewhat comprehensively.

I have the following scenario in mind:

  1. I'm on a x64 machine with a single processor.
  2. There're a couple of processes running, notably the Editor VIM (Process #1) and say LibreOffice (Process #2).
  3. I'm inside VIM and press the a-key. However, the process that's currently running is Process #2 (with VIM being scheduled next).

This is how I imagine things to go down right now:

  1. The keyboard, through a series of steps, generates an electrical signal (USB Protocol Encoding) that it sends down the USB wire.
  2. The signal gets processed by a USB-Controller, and is send through PCI-e (and possibly other controllers / buses?) to the Interrupt Controller (APIC). The APIC triggers the INT Pin of the processor.
  3. The processor switches to Kernel Mode and request an IRQ-Number from the APIC, which it uses as an offset into the Interrupt Descriptor Table Register (IDTR). A descriptor is obtained, that is then used to obtain the address of the interrupt handler routine. As I understand it, this interrupt handler was initially registered by the keyboard driver?
  4. The interrupt handler routine (in this case a keyboard handler routine) is invoked.

This brings me to my main question: By which mechanism does the interrupt handler routine communicate the pressed key to the correct Process (Process #1)? Does it actually do that, or does it simply write the pressed key into a buffer (available through a char-device?), that is read-only to one process at a time (and currently "attached" to Process #1)? I don't understand at which time Process #1 receives the key. Does it process the data immediately, as the interrupt handler schedules the process immediately, or does it process the key data the next time that the scheduler schedules it?

  1. When this handler returns (IRET), the context is switched back to the previously executing process (Process #2).

Best Answer

Your understanding so far is correct, but you miss most of the complexity that's built on that. The processing in the kernel happens in several layers, and the keypress "bubbles up" through the layers.

The USB communication protocol itself is a lot more involved. The interrupt handler routine for USB handles this, and assembles a complete USB packet from multiple fragments, if necessary.

The key press uses the so-called HID ("Human interface device") protocol, which is built on top of USB. So the lower USB kernel layer detects that the complete message is a USB HID event, and passes it to the HID layer in the kernel.

The HID layer interprets this event according to the HID descriptor it has required from the device on initialization. It then passes the events to the input layer. A single HID event can generate multiple key press events.

The input layer uses kernel keyboard layout tables to map the scan code (position of the key on the keyboard) to a key code (like A) and interprets Shift, Alt, etc. The result of this interpretation is made available via /dev/input/event* to userland processes. You can use evtest to watch those events in real-time.

But processing is not finished here. The X Server (responsible for graphics) has a generic evdev driver that reads events from /dev/input/event* devices, and then maps them again according to a second set of keyboard layout tables (you can see those partly with xmodmap and fully via the XKBD extension). This is because the X server predates the kernel input layer, and in earlier times had drivers to handle mouse and PS/2 keys directly.

Then the X server sends a message to the X client (application) containing the keyboard event. You can see those messages with the xev application. LibreOffice will process this event directly, VIM will be running in an xterm which will process the event, and (you guessed it) again add some extra processing to it, and finally pass it to VIM via stdin.

Complicated enough?