When can Linux boot with a read-only root filesystem?

embedded, fsck, root-filesystem, systemd, u-boot

I am working on an embedded Linux system with U-Boot as the bootloader and systemd as the init system. The available tools are limited to the standard BusyBox set.

While analyzing some problems, I found that the root filesystem is sometimes mounted read-only, and that this is the cause: some services and programs depend on a writable root filesystem and malfunction without one.

After researching for a while, I found that the root filesystem only ends up read-only after a power failure. (The main unit triggers a power cycle when it encounters certain errors.) I suspect that if power is lost while something is writing to a critical file on the root filesystem or updating a service or process, an error flag is set on the filesystem. On the next boot, fsck reads that flag and the root is mounted or remounted read-only, or fsck forces the system into some kind of recovery mode (I don't know whether such a mode exists).

Is my hypothesis correct? If so, which flag is set on the filesystem on error, and how do I prevent the root from being mounted read-only at boot?

Note:

  1. The root filesystem is mounted with 'errors=continue'. So if fsck reads the superblock to decide the remount options, it should ignore the error and remount the root read-write.

  2. I tried to reproduce the problem by cutting the power while a dd command was running, but I have never been able to trigger it.

Additional Question: Which udev/systemd magic mounts the root fs?

Best Answer

At boot, you are supposed to check your filesystems to see if the system was shut down properly or if it crashed, and perform the necessary recovery actions in the latter case. On modern journaled filesystems, this usually means a simple and quick journal recovery operation that can be done automatically.
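To make the "was it shut down properly" check concrete, here is a minimal sketch of how that state is recorded on disk. It assumes the root filesystem is ext2/3/4, which the question does not actually state (the 'errors=continue' option only hints at it). In that family the primary superblock starts 1024 bytes into the volume and carries a state field and a default error policy. The function names are made up for illustration, and the script would more likely run on a development host against an image than on a BusyBox-only target.

```python
import struct

EXT_SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1024 bytes into the volume
EXT_MAGIC = 0xEF53            # s_magic value for ext2/3/4

# Bits of the s_state field
STATE_FLAGS = {
    0x0001: "clean",
    0x0002: "errors-detected",
    0x0004: "orphans-being-recovered",
}
# Values of the s_errors field (default behaviour when an error is detected)
ERROR_POLICY = {1: "continue", 2: "remount-ro", 3: "panic"}


def parse_ext_superblock(block: bytes) -> dict:
    """Decode s_magic, s_state and s_errors from raw superblock bytes."""
    # The three 16-bit little-endian fields sit at offsets 0x38, 0x3A and 0x3C.
    magic, state, errors = struct.unpack_from("<HHH", block, 0x38)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    return {
        "state": [name for bit, name in STATE_FLAGS.items() if state & bit],
        "on_error": ERROR_POLICY.get(errors, "unknown"),
    }


def read_ext_state(path: str) -> dict:
    """Read the superblock from a device node or image file (hypothetical helper)."""
    with open(path, "rb") as dev:
        dev.seek(EXT_SUPERBLOCK_OFFSET)
        return parse_ext_superblock(dev.read(1024))
```

Note that the on-disk error policy is only a default: an explicit errors= mount option takes precedence over it, and the "clean" bit is normally cleared while the filesystem is mounted read-write.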

Root filesystem checking and mounting is normally done by initramfs/initrd, but on an embedded system you might or might not have it.

If you are not using initramfs, then the traditional way would be to have the kernel always mount the root filesystem initially as read-only (with boot options root=/dev/<whatever> ro). The start-up scripts would then first run fsck on it (assuming it's necessary for the filesystem type used) and then remount the root filesystem read-write before doing anything else.
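A quick way to see which of these steps actually happened on a given boot is to compare what the kernel was asked to do with what the root mount looks like now. The sketch below parses text in the format of /proc/cmdline and /proc/mounts; the function names and the device name in the usage example are illustrative only.

```python
def root_mount_request(cmdline: str) -> dict:
    """Extract root= and the ro/rw flag from a kernel command line string."""
    result = {"root": None, "mode": None}
    for token in cmdline.split():
        if token.startswith("root="):
            result["root"] = token[len("root="):]
        elif token in ("ro", "rw"):
            # If both flags appear, this sketch keeps the last one seen.
            result["mode"] = token
    return result


def root_is_read_only(mounts_text: str) -> bool:
    """Return True if the last "/" entry in /proc/mounts-style text has the ro option."""
    read_only = False
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 4 and fields[1] == "/":
            read_only = "ro" in fields[3].split(",")
    return read_only


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        print(root_mount_request(f.read()))
    with open("/proc/mounts") as f:
        print("root read-only:", root_is_read_only(f.read()))
```

If the command line says ro and the root is still read-only once the system is up, the remount step either never ran or failed, which narrows down where to look.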

If initramfs did not check the root filesystem (perhaps because it's not being used), then the standard systemd service that runs a filesystem check on the root filesystem is systemd-fsck-root.service. I could not find out the name of the service responsible for remounting the root filesystem with systemd after it's been checked.
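As far as I understand it, the remount step is handled by systemd-remount-fs.service, which applies the mount options from the root entry in /etc/fstab, and the root check is only requested when that entry has a non-zero fs_passno field. Treat both statements as assumptions to verify against the systemd version on your device. Under those assumptions, a small sketch like the following (hypothetical function name, made-up device in the example) shows what systemd would be told to do with the root filesystem:

```python
def root_fstab_entry(fstab_text: str):
    """Return the parsed fstab entry whose mount point is "/", or None if absent."""
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4 or fields[1] != "/":
            continue
        options = fields[3].split(",")
        # The sixth field (fs_passno) is optional and defaults to 0.
        passno = int(fields[5]) if len(fields) > 5 else 0
        return {
            "device": fields[0],
            "fstype": fields[2],
            "options": options,
            "read_only": "ro" in options,
            "fsck_requested": passno != 0,
        }
    return None


if __name__ == "__main__":
    sample = "/dev/mmcblk0p2 / ext4 rw,errors=continue 0 1\n"
    print(root_fstab_entry(sample))
```

A missing root entry, or one that lists ro, would be one plausible reason for the root to stay read-only after the kernel mounted it that way.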

If a boot-time root filesystem check needs to modify the root filesystem, it typically triggers another reboot afterwards, because the modification may have affected something the kernel has already read and is caching, and would now be inconsistent after a correction was made on the disk by fsck.
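The "reboot afterwards" case is signalled through the checker's exit status, which is a bit mask. The sketch below decodes the values documented in the util-linux fsck(8) manual page; whether the BusyBox fsck applet and the filesystem-specific checker on your device report exactly the same bits is an assumption worth confirming.

```python
# Exit status bits as documented in fsck(8)
FSCK_EXIT_BITS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status: int) -> list:
    """Translate an fsck exit status into a list of human-readable conditions."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_EXIT_BITS.items() if status & bit]


def reboot_needed(status: int) -> bool:
    """True when the checker asks for a reboot (bit value 2 is set)."""
    return bool(status & 2)


if __name__ == "__main__":
    for code in (0, 1, 3, 4):
        print(code, decode_fsck_status(code), "reboot" if reboot_needed(code) else "")
```

A status with the "errors left uncorrected" bit set is the case to watch for here, since a root filesystem that could not be repaired automatically is a natural candidate for being left read-only.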