How does a bootloader actually hand off to a kernel?

boot boot-loader grub2 kernel

Assume I have a bootloader with a kernel and initrd line. For all intents and purposes, I now have 2 or 3 stages of "kernel" going on:

  1. firmware
  2. grub (or other bootloader)
  3. actual linux kernel

The above would be for MBR. For EFI, the bootloader (or bootmanager) is just an EFI app which runs while the firmware is the "kernel":

  1. firmware, loads EFI app
  2. actual linux kernel

What is the actual kernel hand-off process? What does MBR grub actually do to go from 2 to 3, or the EFI firmware to go from 1 to 2? Is it similar to kexec?

Second, in the case of EFI, where some hooks are passed to the EFI app and thence to the Linux kernel (so we can do things like efibootmgr, etc.), how is that handed off?

Finally, is it possible to do that more than once? E.g. if I have some custom work I need to do before loading my "regular" OS, e.g. measuring and validating TPM entries, decryption, etc., perhaps things that are not easily done using grub, rEFInd or others, could I load an "interim" stage kernel and initrd, do them, and then hand off?

Best Answer

Some notes, mainly on BIOS/GRUB systems.

BIOS system with GRUB:

The BIOS starts executing at address 0xfffffff0 (x86).

It performs various tests, e.g. POST. If all is well, it checks the boot devices in the order configured and saved in CMOS. The first sector of the first boot device that has a valid MBR (the signature bytes at offset 510 are 0x55 0xaa) is loaded into memory at address 0x7c00.
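The signature test described above can be sketched in a few lines. This is only an illustration on a synthetic 512-byte buffer, not how any particular BIOS implements it; the function name `has_boot_signature` is made up for this example.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # the two bytes at offsets 510 and 511


def has_boot_signature(sector: bytes) -> bool:
    """Return True if the first sector ends with the bytes 0x55 0xAA."""
    if len(sector) < SECTOR_SIZE:
        return False
    return sector[510:512] == BOOT_SIGNATURE


# Synthetic sector for illustration; nothing is read from a real disk.
sector = bytearray(SECTOR_SIZE)
sector[510:512] = BOOT_SIGNATURE
print(has_boot_signature(bytes(sector)))
```

To try it on a real disk you would read the first 512 bytes of the device (which normally requires root) and pass them in.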

The BIOS then hands control to whatever code (bytes) was loaded from the MBR at offset 0. That is, the data at the point where control is handed over must be processor instructions: a program.

For example, if you look at an MBR image you will likely find something like eb6390 at the beginning. This translates to two machine instructions:

eb63 => short jump with displacement 0x63 (offset 0x65 in the MBR, as the count is from the end of the instruction)
90   => No Operation
  • boot.S in the GRUB source. First instructions in MBR in assembly:

    jmp LOCAL(after_BPB)
    nop
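
As a rough check of the offset arithmetic for those first bytes, here is a small sketch that decodes an x86 short jump (opcode 0xEB followed by a signed 8-bit displacement). The helper name `decode_short_jmp` is made up for this example, and it handles only this one opcode.

```python
def decode_short_jmp(code: bytes, offset: int = 0) -> int:
    """Return the target offset of an x86 short jump (opcode 0xEB)."""
    if code[offset] != 0xEB:
        raise ValueError("not a short jmp (0xEB)")
    # rel8 is a signed 8-bit displacement.
    rel8 = int.from_bytes(code[offset + 1:offset + 2], "little", signed=True)
    # The displacement is relative to the end of the 2-byte instruction.
    return offset + 2 + rel8


mbr_start = bytes.fromhex("eb6390")
target = decode_short_jmp(mbr_start)
print(hex(target))  # 0x65
```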
    

From here on, GRUB loads the next stage, typically the first sector of core.img.

  • diskboot.S in the GRUB source on a normal disk boot.

A jump is made to this code, which then loads the rest of core.img. This includes, for example, Reed-Solomon error correction and decompression, from e.g. startup_raw.S. Current GRUB is module-based, and the modules are also loaded at this stage.

GRUB then reads its configuration files, and once it has determined which kernel to run, it loads that kernel from the /boot directory into memory. Then the initial RAM disk image, initrd, is loaded into memory.

The boot loader also writes the memory address of the configuration strings, i.e. the boot options, into the kernel's memory space. See the header fields marked "modify".
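
To make the header fields more concrete, here is a minimal sketch that reads a few of them from an x86 kernel image. The offsets follow the Linux x86 boot protocol as I understand it (setup_sects at 0x1f1, boot_flag at 0x1fe, the "HdrS" magic at 0x202, version at 0x206, ramdisk_image at 0x218, ramdisk_size at 0x21c, cmd_line_ptr at 0x228, the last one from protocol 2.02 on). The names `SetupHeader` and `parse_setup_header` are made up for this example, and the demo runs on a synthetic buffer rather than a real kernel.

```python
import struct
from typing import NamedTuple, Tuple


class SetupHeader(NamedTuple):
    setup_sects: int
    version: Tuple[int, int]
    ramdisk_image: int
    ramdisk_size: int
    cmd_line_ptr: int


def parse_setup_header(image: bytes) -> SetupHeader:
    """Read a few fields of the real-mode setup header from a kernel image."""
    if len(image) < 0x22C:
        raise ValueError("image too short to contain a setup header")
    (boot_flag,) = struct.unpack_from("<H", image, 0x1FE)
    if boot_flag != 0xAA55:
        raise ValueError("missing 0xAA55 boot flag")
    if image[0x202:0x206] != b"HdrS":
        raise ValueError("missing HdrS magic (very old or non-Linux image)")
    (version,) = struct.unpack_from("<H", image, 0x206)
    ramdisk_image, ramdisk_size = struct.unpack_from("<II", image, 0x218)
    (cmd_line_ptr,) = struct.unpack_from("<I", image, 0x228)
    return SetupHeader(
        setup_sects=image[0x1F1],
        version=(version >> 8, version & 0xFF),
        ramdisk_image=ramdisk_image,
        ramdisk_size=ramdisk_size,
        cmd_line_ptr=cmd_line_ptr,
    )


# Synthetic header; all values below are arbitrary illustration values.
image = bytearray(0x400)
image[0x1F1] = 4
struct.pack_into("<H", image, 0x1FE, 0xAA55)
image[0x202:0x206] = b"HdrS"
struct.pack_into("<H", image, 0x206, 0x020F)
struct.pack_into("<I", image, 0x228, 0x1E000)  # where a loader put the command line
hdr = parse_setup_header(bytes(image))
print(hdr)
```

In a kernel file on disk, the fields the boot loader is expected to fill in (such as ramdisk_image and cmd_line_ptr) are zero; the boot loader sets them in the copy it has loaded into memory.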

Also note that the boot loader normally alternates between real and protected mode during the load stage. This is so that it can load data beyond the 1 MB limit.

When this is done, the boot loader hands control to the kernel, just as the BIOS handed control to the boot loader through the MBR. With the 16-bit boot protocol, this is done in real mode.

The kernel is (usually) module-based. Among the modules are, for example, file system modules. At startup the kernel will likely have to read files from a file system that it needs a module to read; this is where initrd comes in. The modules needed to get started reside there.

(U)EFI:

The (U)EFI boot process can follow much the same track as BIOS/GRUB if one uses a UEFI GRUB install. One also has the option of using the EFI Boot Stub, which allows the EFI firmware to load the kernel as an EFI executable.
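
A kernel built with the EFI Boot Stub carries a PE/COFF header, which is what lets the firmware treat it as an EFI executable. The following sketch checks only the two markers a PE image starts with ("MZ" at offset 0, and the "PE\0\0" signature at the offset stored at 0x3c). The function name `looks_like_efi_executable` is made up, the demo uses a synthetic buffer, and passing this check does not prove that an image will actually boot.

```python
import struct


def looks_like_efi_executable(image: bytes) -> bool:
    """Heuristic: does the image start with an MZ stub pointing at a PE header?"""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return False
    # The 32-bit little-endian value at 0x3c is the offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    return image[pe_offset:pe_offset + 4] == b"PE\0\0"


# Synthetic image for illustration only.
image = bytearray(0x100)
image[:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x80)
image[0x80:0x84] = b"PE\0\0"
print(looks_like_efi_executable(bytes(image)))
```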

Further, as of kernel 3.14, kexec is also available, but it is not intended for cold boot.
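
For reference, a kexec hand-off from an already running kernel is usually two steps with the `kexec` tool from kexec-tools: load the new kernel and initrd, then jump to it. The sketch below only builds and prints the two command lines and does not run anything. The paths are placeholders, and the helper name `kexec_commands` is made up for this example.

```python
import shlex
from typing import List


def kexec_commands(kernel: str, initrd: str, reuse_cmdline: bool = True) -> List[List[str]]:
    """Build (but do not run) the argv lists for a two-step kexec."""
    load = ["kexec", "-l", kernel, f"--initrd={initrd}"]
    if reuse_cmdline:
        # Reuse the command line of the currently running kernel.
        load.append("--reuse-cmdline")
    execute = ["kexec", "-e"]  # jumps into the loaded kernel immediately
    return [load, execute]


# Placeholder paths; adjust to the kernel and initrd you actually want to start.
commands = kexec_commands("/boot/vmlinuz", "/boot/initrd.img")
for argv in commands:
    print(shlex.join(argv))
```

Note that `kexec -e` does not shut the running system down cleanly first, so anything that needs to be unmounted or synced has to be handled before that step.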
