Linux – How does Linux load the ‘initrd’ image

bootinitrdkernellinux

I've been trying to understand the booting process, but there's just one thing that is going over my head..

As soon as the Linux kernel has been booted and the root file system (/) mounted, programs can be run and further kernel modules can be integrated to provide additional functions. To mount the root file system, certain conditions must be met. The kernel needs the corresponding drivers to access the device on which the root file system is located (especially SCSI drivers). The kernel must also contain the code needed to read the file system (ext2, reiserfs, romfs, etc.). It is also conceivable that the root file system is already encrypted. In this case, a password is needed to mount the file system.

The initial ramdisk (also called initdisk or initrd) solves precisely the problems described above. The Linux kernel provides an option of having a small file system loaded to a RAM disk and running programs there before the actual root file system is mounted. The loading of initrd is handled by the boot loader (GRUB, LILO, etc.). Boot loaders only need BIOS routines to load data from the boot medium. If the boot loader is able to load the kernel, it can also load the initial ramdisk. Special drivers are not required.

If /boot is not a different partition, but is present in the / partition, then shouldn't the boot loader require the SCSI drivers, to access the 'initrd' image and the kernel image? If you can access the images directly, then why exactly do we need the SCSI drivers??

Best Answer

Nighpher, I'll try to answer your question, but for a more comprehensive description of boot process, try this article at IBM.

Ok, I assume, that you are using GRUB or GRUB2 as your bootloader for explanation. First off, when BIOS accesses your disk to load the bootloader, it makes use of its built-in routines for disk access, which are stored in the famous 13h interrupt. Bootloader (and kernel at setup phase) make use of those routines when they access disk. Note that BIOS runs in real mode (16-bit mode) of the processor, thus it cannot address more than 2^20 bytes of RAM (2^20, not 2^16, because each address in real mode is comprised of segment_address*16 + offset, where both segment address and offset are 16-bit, see "x86 memory segmentation" at Wikipedia). Thus, these routines can't access more than 1 MiB of RAM, which is a strict limitation and a major inconvenience.

BIOS loads bootloader code right from the MBR – the first 512 bytes of your disk – and executes it. If you're using GRUB, that code is GRUB stage 1. That code loads GRUB stage 1.5, which is located either in the first 32 KiB of disk space, called DOS compatibility region, or from a fixed address of the file system. It doesn't need to understand the file system structure to do this, because even if stage 1.5 is in the file system, it is "raw" code and can be directly loaded to RAM and executed: See "Details of GRUB on the PC" at pixelbeat.org, which is the source for the below image. Load of stage 1.5 from disk to RAM makes use of BIOS disk access routines.

enter image description here

Stage 1.5 contains the filesystem utilities, so that it can read the stage 2 from the filesystem (well, it still uses BIOS 13h to read from disk to RAM, but now it can decipher filesystem info about inodes, etc., and get raw code out of the disk). Older BIOSes might not be able to access the whole HD due to limitations in their disk addressing mode – they might use Cylinder-Head-Sector system, unable to address more than first 8 GiB of disk space: http://en.wikipedia.org/wiki/Cylinder-head-sector.

Stage 2 loads the kernel into RAM (again, using BIOS disk utilities). If it's 2.6+ kernel, it also has initramfs compiled within, so no need to load it. If it's an older kernel, bootloader also loads standalone initrd image into memory, so that kernel could mount it and get drivers for mounting real file system from disk.

The problem is that the kernel (and ramdisk) weigh more than 1 MiB; thus, to load them into RAM you have to load the kernel into the first 1 MiB, then jump to protected mode (32-bit), move the loaded kernel to high memory (free the first 1 MiB for real mode), then return to real (16-bit) mode again, get ramdisk from disk to first 1 MiB (if it's a separate initrd and older kernel), possibly switch to protected (32-bit) mode again, put it to where it belongs, possibly get back to real mode (or not: https://stackoverflow.com/questions/4821911/does-grub-switch-to-protected-mode) and execute the kernel code. Warning: I'm not entirely sure about thoroughness and accuracy of this part of description.

Now, when you finally run the kernel, you already have it and ramdisk loaded into RAM by bootloader, so the kernel can use disk utilities from ramdisk to mount your real root file system and pivot root to it. ramfs drivers are present in the kernel, so it can understand the contents of initramfs, of course.