Linux – Minimum Root Filesystem Applications for Full Boot


It's a question about user space applications, but hear me out!

Three "applications", so to speak, are required to boot a functional distribution of Linux:

  1. Bootloader – For embedded systems that's typically U-Boot, although it's not a hard requirement.

  2. Kernel – That's pretty straightforward.

  3. Root Filesystem – Can't boot to a shell without it. Contains the filesystem the kernel boots to, and where init is called from.

My question is in regard to #3. If someone wanted to build an extremely minimal rootfs (for this question let's say no GUI, shell only), what files/programs are required to boot to a shell?

Best Answer

That entirely depends on what services you want to have on your device.

Programs

You can make Linux boot directly into a shell. It isn't very useful in production — who'd just want to have a shell sitting there — but it's useful as an intervention mechanism when you have an interactive bootloader: pass init=/bin/sh to the kernel command line. All Linux systems (and all unix systems) have a Bourne/POSIX-style shell in /bin/sh.

You'll need a set of shell utilities. BusyBox is a very common choice; it contains a shell and common utilities for file and text manipulation (cp, grep, …), networking setup (ping, ifconfig, …), process manipulation (ps, nice, …), and various other system tools (fdisk, mount, syslogd, …). BusyBox is extremely configurable: you can select which tools you want and even individual features at compile time, to get the right size/functionality compromise for your application. Apart from sh, the bare minimum that you can't really do anything without is mount, umount and halt, but it would be atypical to not have also cat, cp, mv, rm, mkdir, rmdir, ps, sync and a few more. BusyBox installs as a single binary called busybox, with a symbolic link for each utility.
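
As a rough illustration of that last point, the symbolic links can be created by hand in a staging directory. This is a minimal sketch, not BusyBox's own installer: the staging path and the applet list (taken from the utilities named above) are assumptions, and the links stay dangling until a busybox binary is copied to bin/busybox.

```sh
#!/bin/sh
# Staging directory for the root filesystem (hypothetical; a temp dir here).
ROOT=${ROOT:-$(mktemp -d)}
mkdir -p "$ROOT/bin"

# Applets named in the text; adjust to match your BusyBox configuration.
APPLETS="sh mount umount halt cat cp mv rm mkdir rmdir ps sync"

for applet in $APPLETS; do
    # Relative link: /bin/<applet> -> busybox (the binary goes in /bin/busybox).
    ln -sf busybox "$ROOT/bin/$applet"
done

ls -l "$ROOT/bin"
```

If the BusyBox build includes its installer feature, running busybox --install -s on the target should create equivalent links without a hand-written loop.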

The first process on a normal unix system is called init. Its job is to start other services. BusyBox contains an init system. In addition to the init binary (usually located in /sbin), you'll need its configuration file (usually /etc/inittab; some modern init replacements do away with that file, but you won't find them on a small embedded system), which indicates what services to start and when. For BusyBox, /etc/inittab is optional; if it's missing, you get a root shell on the console and the script /etc/init.d/rcS (default location) is executed at boot time.
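
For reference, an explicit /etc/inittab that spells out roughly that default behavior might look like the sketch below. The entries use BusyBox's id:runlevels:action:process line format (BusyBox ignores the runlevels field, and an empty id means the console). The staging directory is an assumption, and the set of actions you actually want will likely differ.

```sh
#!/bin/sh
# Hypothetical staging directory for the root filesystem.
ROOT=${ROOT:-$(mktemp -d)}
mkdir -p "$ROOT/etc"

cat > "$ROOT/etc/inittab" <<'EOF'
# Run the boot script once at startup.
::sysinit:/etc/init.d/rcS
# Offer a shell on the console; "askfirst" waits for Enter before starting it.
::askfirst:/bin/sh
# Unmount filesystems cleanly when the system is going down.
::shutdown:/bin/umount -a -r
EOF

cat "$ROOT/etc/inittab"
```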

That's all you need, beyond of course the programs that make your device do something useful. For example, on my home router running an OpenWrt variant, the only programs are BusyBox, nvram (to read and change settings in NVRAM), and networking utilities.

Unless all your executables are statically linked, you will need the dynamic loader (ld.so, which may be called by different names depending on the choice of libc and on the processor architectures) and all the dynamic libraries (/lib/lib*.so, perhaps some of these in /usr/lib) required by these executables.
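
To work out which libraries need copying, one option is to parse ldd-style output. The helper below is only a sketch: the function name libs_from_ldd and the sample input are made up for illustration, and it assumes the line format that glibc's ldd prints. Running ldd itself only works for binaries the build host can load, so a cross-compiled target would need the toolchain's equivalent.

```sh
#!/bin/sh
# Print the absolute library paths found in ldd-style output read from stdin.
libs_from_ldd() {
    awk '
        $2 == "=>" && $3 ~ /^\// { print $3; next }  # name => /path (address)
        $1 ~ /^\//               { print $1 }        # /path (address): the dynamic loader
    '
}

# Illustrative input only; real names, paths and addresses will differ.
sample='	linux-vdso.so.1 (0x00007ffc0000)
	libc.so.6 => /lib/libc.so.6 (0x00007f000000)
	/lib/ld-linux.so.2 (0x00007f100000)'

printf '%s\n' "$sample" | libs_from_ldd
```

On a native build, piping the output of ldd for each executable through this function gives a list of files to copy into /lib of the staging tree.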

Directory structure

The Filesystem Hierarchy Standard describes the common directory structure of Linux systems. It is geared towards desktop and server installations: a lot of it can be omitted on an embedded system. Here is a typical minimum.

  • /bin: executable programs (some may be in /usr/bin instead).
  • /dev: device nodes (see below)
  • /etc: configuration files
  • /lib: shared libraries, including the dynamic loader (unless all executables are statically linked)
  • /proc: mount point for the proc filesystem
  • /sbin: executable programs. The distinction with /bin is that /sbin is for programs that are only useful to the system administrator, but this distinction isn't meaningful on embedded devices. You can make /sbin a symbolic link to /bin.
  • /mnt: handy to have on read-only root filesystems as a scratch mount point during maintenance
  • /sys: mount point for the sysfs filesystem
  • /tmp: location for temporary files (often a tmpfs mount)
  • /usr: contains subdirectories bin, lib and sbin. /usr exists for extra files that are not on the root filesystem. If you don't have that, you can make /usr a symbolic link to the root directory.
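
The layout above can be sketched as a few commands run against a staging directory. The staging path is an assumption, and the two symbolic links are the optional shortcuts mentioned in the list rather than a requirement.

```sh
#!/bin/sh
# Hypothetical staging directory for the root filesystem.
ROOT=${ROOT:-$(mktemp -d)}

# Real directories from the list above.
for dir in bin dev etc lib proc mnt sys tmp; do
    mkdir -p "$ROOT/$dir"
done
chmod 1777 "$ROOT/tmp"   # world-writable with the sticky bit, as usual for /tmp

# Optional shortcuts: fold /sbin into /bin and /usr into the root directory.
ln -s bin "$ROOT/sbin"   # /sbin    -> /bin
ln -s .   "$ROOT/usr"    # /usr/bin -> /bin, /usr/lib -> /lib

ls -l "$ROOT"
```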

Device files

Here are some typical entries in a minimal /dev:

  • console
  • full (writing to it always reports “no space left on device”)
  • log (a socket that programs use to send log entries), if you have a syslogd daemon (such as BusyBox's) reading from it
  • null (acts like a file that's always empty)
  • ptmx and a pts directory, if you want to use pseudo-terminals (i.e. any terminal other than the console) — e.g. if the device is networked and you want to telnet or ssh in
  • random (returns random bytes, risks blocking)
  • tty (always designates the program's terminal)
  • urandom (returns random bytes, never blocks but may be non-random on a freshly-booted device)
  • zero (contains an infinite sequence of null bytes)

Beyond that you'll need entries for your hardware (except network interfaces, these don't get entries in /dev): serial ports, storage, etc.

For embedded devices, you would normally create the device entries directly on the root filesystem. High-end systems have a script called MAKEDEV to create /dev entries, but on an embedded system the script is often not bundled into the image. If some hardware can be hotplugged (e.g. if the device has a USB host port), then /dev should be managed by udev (you may still have a minimal set on the root filesystem).
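
Creating static entries comes down to a handful of mknod calls. Because mknod needs root privileges, the sketch below only prints the commands so they can be reviewed first. The major and minor numbers are the standard Linux assignments for these devices; the permission modes and the staging path are assumptions. /dev/log is not in the table because it is a socket that the syslog daemon creates itself.

```sh
#!/bin/sh
ROOT=${ROOT:-./rootfs}   # hypothetical staging directory (no spaces in the path)

# Table columns: name, mode, type, major, minor.
print_mknod_commands() {
    while read -r name mode type major minor; do
        echo "mknod -m $mode $ROOT/dev/$name $type $major $minor"
    done <<'EOF'
console 600 c 5 1
full 666 c 1 7
null 666 c 1 3
ptmx 666 c 5 2
random 666 c 1 8
tty 666 c 5 0
urandom 666 c 1 9
zero 666 c 1 5
EOF
    # Mount point for devpts, needed only for pseudo-terminals.
    echo "mkdir -p $ROOT/dev/pts"
}

print_mknod_commands
```

Once the output looks right, it can be piped to sh as root to create the nodes.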

Boot-time actions

Beyond the root filesystem, you need to mount a few more for normal operation:

  • procfs on /proc (pretty much indispensable)
  • sysfs on /sys (pretty much indispensable)
  • tmpfs filesystem on /tmp (to allow programs to create temporary files that will be in RAM, rather than on the root filesystem which may be in flash or read-only)
  • tmpfs, devfs or devtmpfs on /dev if dynamic (see udev in “Device files” above)
  • devpts on /dev/pts if you want to use pseudo-terminals (see the remark about pts above)

You can make an /etc/fstab file and call mount -a, or run mount manually.
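
A minimal /etc/fstab covering the mounts listed above might look like this sketch. It assumes a static /dev that already contains a pts directory (drop the devpts line if you don't use pseudo-terminals); the staging path and the mount options are illustrative.

```sh
#!/bin/sh
# Hypothetical staging directory for the root filesystem.
ROOT=${ROOT:-$(mktemp -d)}
mkdir -p "$ROOT/etc"

# Fields: device, mount point, type, options, dump, pass.
cat > "$ROOT/etc/fstab" <<'EOF'
proc    /proc     proc    defaults   0 0
sysfs   /sys      sysfs   defaults   0 0
tmpfs   /tmp      tmpfs   mode=1777  0 0
devpts  /dev/pts  devpts  defaults   0 0
EOF

cat "$ROOT/etc/fstab"
```

With this file in place, a single mount -a early in the boot script mounts all four.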

Start a syslog daemon (as well as klogd for kernel logs, if the syslogd program doesn't take care of it), if you have any place to write logs to.
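
Putting the boot-time steps together, the startup script could be as short as the sketch below. It assumes BusyBox's default script location /etc/init.d/rcS, an /etc/fstab listing the filesystems above, and BusyBox's syslogd and klogd, which put themselves in the background when started; other implementations may need different invocations.

```sh
#!/bin/sh
# Hypothetical staging directory for the root filesystem.
ROOT=${ROOT:-$(mktemp -d)}
mkdir -p "$ROOT/etc/init.d"

cat > "$ROOT/etc/init.d/rcS" <<'EOF'
#!/bin/sh
# Mount everything listed in /etc/fstab (proc, sysfs, tmpfs, ...).
mount -a
# Start logging: syslogd reads /dev/log, klogd forwards kernel messages to it.
syslogd
klogd
# Application-specific services would be started from here on.
EOF
chmod 755 "$ROOT/etc/init.d/rcS"

# Syntax-check the generated script without running it.
sh -n "$ROOT/etc/init.d/rcS" && echo "rcS syntax OK"
```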

After this, the device is ready to start application-specific services.

How to make a root filesystem

This is a long and diverse story, so all I'll do here is give a few pointers.

The root filesystem may be kept in RAM (loaded from a (usually compressed) image in ROM or flash), or on a disk-based filesystem (stored in ROM or flash), or loaded from the network (often over TFTP) if applicable. If the root filesystem is in RAM, make it the initramfs — a RAM filesystem whose content is created at boot time.
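
For the initramfs case, the kernel expects a cpio archive in the "newc" format, optionally compressed. Here is a minimal sketch of packing a staging directory, assuming cpio and gzip are installed on the build host; the function name and the paths in the usage note are illustrative.

```sh
#!/bin/sh
# pack_initramfs SRC_DIR OUT_FILE
# Archive everything under SRC_DIR as a gzip-compressed newc cpio image.
pack_initramfs() {
    src=$1
    out=$2
    # The redirection is opened before the subshell changes directory,
    # so a relative OUT_FILE is relative to the caller's directory.
    # Keep OUT_FILE outside SRC_DIR, or find will pick it up too.
    ( cd "$src" && find . | cpio -o -H newc | gzip -9 ) > "$out"
}
```

A call such as pack_initramfs ./rootfs ./initramfs.cpio.gz produces the image. File ownership is taken from the staging tree as it is on disk, so files that must belong to root on the target need to be owned by root when the archive is made.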

Many frameworks exist for assembling root images for embedded systems. There are a few pointers in the BusyBox FAQ. Buildroot is a popular one, allowing you to build a whole root image with a setup similar to the Linux kernel and BusyBox. OpenEmbedded is another such framework.

Wikipedia has an (incomplete) list of popular embedded Linux distributions. An example of embedded Linux you may have near you is the OpenWrt family of operating systems for network appliances (popular on tinkerers' home routers). If you want to learn by experience, you can try Linux from Scratch, but it's geared towards desktop systems for hobbyists rather than towards embedded devices.

A note on Linux vs Linux kernel

The only behavior that's baked into the Linux kernel is the first program that's launched at boot time. (I won't get into initrd and initramfs subtleties here.) This program, traditionally called init, has process ID 1 and has certain privileges (immunity to KILL signals) and responsibilities (reaping orphans). You can run a system with a Linux kernel and start whatever you want as the first process, but then what you have is an operating system based on the Linux kernel, and not what is normally called “Linux”. Linux, in the common sense of the term, is a Unix-like operating system whose kernel is the Linux kernel. For example, Android is an operating system which is not Unix-like but based on the Linux kernel.
