Systemd – How to Disable Aggressive Emergency Shell Behavior?

administration, boot, systemd

By default, systemd drops to an emergency shell at the slightest error. For example, if one of the mounts in fstab fails for some reason, the system becomes unbootable immediately. I manage dozens of diverse production systems and I've found this behavior very damaging. (Actually, I think it's a major design failure, but that's a personal opinion.)

I'd like to increase the system's boot resilience. Ideally the system should always boot: missing drivers, failed mounts, etc. shouldn't drop to an emergency shell (just show a warning instead) unless the error would make console login absolutely impossible. Whatever can be run should be run.

I know systemd automatically generates *.mount units from /etc/fstab, and I could use the nofail option with a small x-systemd.device-timeout (or define the relevant .mount files myself). However, that wouldn't solve my problem. I want to make the system more resilient; "patching" fstab every time is not very convenient, and I'm not sure how many other possible "problems" exist that would render my system unbootable just because some developer somewhere thought they were important enough.

In short, I'd like to regain control over my machine and not let systemd decide which problems are serious enough to crash the boot process. Is it possible?

Best Answer

It is literally only mount failures that trigger this, so that is all you would need to change.

So the letter of your request would be trivial to answer. Create a drop-in file:

# /etc/systemd/system/local-fs.target.d/nofail.conf

# Clear OnFailure= (set it to nothing)
[Unit]
OnFailure=

I believe this will add no new problems beyond those that Linux sysvinit already suffered from by allowing this partial-failure scenario.
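
One way to apply and check the drop-in is sketched below. The write_dropin helper and its optional root-directory argument are hypothetical names introduced here for illustration (the argument only exists so the file can be staged under a scratch directory first); the file path and contents are the ones shown above.

```shell
# Hypothetical helper: write the drop-in under an optional alternate root.
# With no argument it targets the live system and needs root privileges.
write_dropin() {
    root="${1:-}"
    dir="$root/etc/systemd/system/local-fs.target.d"
    mkdir -p "$dir" || return 1
    # An empty assignment clears OnFailure= for local-fs.target.
    printf '[Unit]\nOnFailure=\n' > "$dir/nofail.conf"
}

# On a real system (as root), something like:
#   write_dropin
#   systemctl daemon-reload
#   systemctl show --property OnFailure local-fs.target
```

After the reload, the last command should report an empty OnFailure= value if the drop-in took effect.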


However, you also raised the question of how long systemd should wait for the specified block devices to become available. I can see no way to configure this as a default (that is, without editing each fstab entry) short of providing a replacement for the fstab generator as a whole. See https://www.freedesktop.org/software/systemd/man/systemd.generator.html

Dropping a large amount of less widely used code in here seems unlikely to increase system resilience. I think the closest solution would be to patch the existing fstab generator. It's not massively complex; I suspect you could get away with it and keep up with any significant changes.

Technically, if your distribution had a self-contained mountall sysvinit script, you could try hooking that in. But that will significantly change the boot process - it's actually more of a fork. I would not recommend that approach.


https://unix.stackexchange.com/a/393711/29483

If you search through the unit files, there are only a few ways for the boot to fall back to emergency.target. It usually happens when a .mount unit for a local filesystem fails, causing local-fs.target to fail, or when your initramfs fails to mount the root filesystem (if your initramfs uses systemd).

local-fs.target has OnFailure=emergency.target, and it fails because units for local filesystems are automatically added to the Requires= list of local-fs.target (unless they have DefaultDependencies=no).

$ systemctl show --property Requires local-fs.target
Requires=-.mount home.mount boot.mount boot-efi.mount
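
To see which fstab entries are likely to end up in that Requires= list, a rough audit can be sketched as follows. The function name list_hard_mounts is made up for this example, and the logic is a simplification: it treats every entry without nofail or noauto as a hard dependency and skips swap, but it does not recognise network filesystems (which are ordered under remote-fs.target rather than local-fs.target) or hand-written .mount units.

```shell
# Hypothetical helper: print mount points of fstab entries that have
# neither nofail nor noauto, i.e. the ones a failed mount would likely
# turn into a failed local-fs.target.
# Usage: list_hard_mounts [path-to-fstab]   (default: /etc/fstab)
list_hard_mounts() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }   # skip blank lines and comments
        $3 == "swap"         { next }   # swap is not part of local-fs.target
        {
            # Split the options field (4th column) on commas.
            n = split($4, opts, ",")
            soft = 0
            for (i = 1; i <= n; i++)
                if (opts[i] == "nofail" || opts[i] == "noauto")
                    soft = 1
            if (!soft)
                print $2                # mount point (2nd column)
        }
    ' "${1:-/etc/fstab}"
}
```

Running list_hard_mounts with no argument reads /etc/fstab; any mount point it prints is a candidate for the nofail option if the system should keep booting without it.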